INHERITED AND SOMATIC MUTATIONS OF
APC GENE IN COLORECTAL CANCER OF HUMANS

The U.S. Government has a paid-up license in this invention and
the right in limited circumstances to require the patent owner to
license others on reasonable terms as provided for by the terms of
grants awarded by the National Institutes of Health.

TECHNICAL AREA OF THE INVENTION

The invention relates to the area of cancer diagnostics and ther- 
apeutics. More particularly, the invention relates to detection of the 
germline and somatic alterations of wild-type APC genes. In addition, 
it relates to therapeutic intervention to restore the function of APC 
gene product.

BACKGROUND OF THE INVENTION 

According to the model of Knudson for tumorigenesis (Cancer
Research, Vol. 45, p. 1482, 1985), there are tumor suppressor genes in
all normal cells which, when they become non-functional due to
mutation, cause neoplastic development. Evidence for this model has
been found in the cases of retinoblastoma and colorectal tumors. The
implicated suppressor genes in those tumors, RB, p53, DCC and MCC,
were found to be deleted or altered in many cases of the tumors
studied. (Hansen and Cavenee, Cancer Research, Vol. 47, pp. 5518-5527
(1987); Baker et al., Science, Vol. 244, p. 217 (1989); Fearon et al.,
Science, Vol. 247, p. 49 (1990); Kinzler et al., Science, Vol. 251,
p. 1366 (1991).)

In order to fully understand the pathogenesis of tumors, it will
be necessary to identify the other suppressor genes that play a role in
the tumorigenesis process. Prominent among these is the one(s)
presumptively located at 5q21. Cytogenetic (Herrera et al., Am. J. Med.
Genet., Vol. 25, p. 473 (1986)) and linkage (Leppert et al., Science,
Vol. 238, p. 1411 (1987); Bodmer et al., Nature, Vol. 328, p. 614
(1987)) studies have shown that this chromosome region harbors the gene
responsible for familial adenomatous polyposis (FAP) and Gardner's
syndrome (GS). FAP is an autosomal-dominant, inherited disease in
which affected individuals develop hundreds to thousands of
adenomatous polyps, some of which progress to malignancy. GS is a
variant of FAP in which desmoid tumors, osteomas and other soft tissue
tumors occur together with multiple adenomas of the colon and rectum.
A less severe form of polyposis has been identified in which only
a few (2-40) polyps develop. This condition also is familial and is
linked to the same chromosomal markers as FAP and GS (Leppert et al.,
New England Journal of Medicine, Vol. 322, pp. 904-908, 1990).
Additionally, this chromosomal region is often deleted from the adenomas
(Vogelstein et al., N. Engl. J. Med., Vol. 319, p. 525 (1988)) and
carcinomas (Vogelstein et al., N. Engl. J. Med., Vol. 319, p. 525
(1988); Solomon et al., Nature, Vol. 328, p. 616 (1987); Sasaki et al.,
Cancer Research, Vol. 49, p. 4402 (1989); Delattre et al., Lancet,
Vol. 2, p. 353 (1989); and Ashton-Rickardt et al., Oncogene, Vol. 4,
p. 1169 (1989)) of patients without FAP (sporadic tumors). Thus, a
putative suppressor gene on chromosome 5q21 appears to play a role in
the early stages of colorectal neoplasia in both sporadic and familial
tumors.

Although the MCC gene has been identified on 5q21 as a candidate
suppressor gene, it does not appear to be altered in FAP or GS
patients. Thus there is a need in the art for investigations of this
chromosomal region to identify genes and to determine if any of such
genes are associated with FAP and/or GS and the process of
tumorigenesis.

SUMMARY OF THE INVENTION

It is an object of the present invention to provide a method for 
diagnosing and prognosing a neoplastic tissue of a human. 

It is another object of the invention to provide a method of
detecting genetic predisposition to cancer. 

It is another object of the invention to provide a method of
supplying wild-type APC gene function to a cell which has lost said
gene function.

It is yet another object of the invention to provide a kit for
determination of the nucleotide sequence of APC alleles by the
polymerase chain reaction.

It is still another object of the invention to provide nucleic acid
probes for detection of mutations in the human APC gene.

It is still another object of the invention to provide a cDNA mol- 
ecule encoding the APC gene product. 

It is yet another object of the invention to provide a preparation 
of the human APC protein.

It is another object of the invention to provide a method of
screening for genetic predisposition to cancer. 

It is an object of the invention to provide methods of testing 
therapeutic agents for the ability to suppress neoplasia.

It is still another object of the invention to provide animals car- 
rying mutant APC alleles. 

These and other objects of the invention are provided by one or 
more of the embodiments which are described below. In one embodi- 
ment of the present invention a method of diagnosing or prognosing a 
neoplastic tissue of a human is provided comprising: detecting somatic 
alteration of wild-type APC genes or their expression products in a 
sporadic colorectal cancer tissue, said alteration indicating neoplasia of 
the tissue. 

In yet another embodiment a method is provided of detecting
genetic predisposition to cancer in a human, including familial
adenomatous polyposis (FAP) and Gardner's Syndrome (GS), comprising:
isolating a human sample selected from the group consisting of blood
and fetal tissue; detecting alteration of wild-type APC gene coding
sequences or their expression products from the sample, said alteration
indicating genetic predisposition to cancer.

In another embodiment of the present invention a method is
provided for supplying wild-type APC gene function to a cell which has
lost said gene function by virtue of a mutation in the APC gene,
comprising: introducing a wild-type APC gene into a cell which has lost
said gene function such that said wild-type gene is expressed in the
cell.

In another embodiment a method of supplying wild-type APC
gene function to a cell is provided comprising: introducing a portion of
a wild-type APC gene into a cell which has lost said gene function such
that said portion is expressed in the cell, said portion encoding a part
of the APC protein which is required for non-neoplastic growth of said
cell. APC protein can also be applied to cells or administered to
animals to remediate for mutant APC genes. Synthetic peptides or drugs
can also be used to mimic APC function in cells which have altered
APC expression.

In yet another embodiment a pair of single stranded primers is
provided for determination of the nucleotide sequence of the APC gene
by polymerase chain reaction. The sequence of said pair of single
stranded DNA primers is derived from chromosome 5q band 21, said
pair of primers allowing synthesis of APC gene coding sequences.

In still another embodiment of the invention a nucleic acid probe
is provided which is complementary to human wild-type APC gene coding
sequences and which can form mismatches with mutant APC genes,
thereby allowing their detection by enzymatic or chemical cleavage or
by shifts in electrophoretic mobility.

In another embodiment of the invention a method is provided for
detecting the presence of a neoplastic tissue in a human. The method
comprises isolating a body sample from a human; detecting in said
sample alteration of a wild-type APC gene sequence or wild-type APC
expression product, said alteration indicating the presence of a
neoplastic tissue in the human.

In still another embodiment a cDNA molecule is provided which 
comprises the coding sequence of the APC gene. 

In even another embodiment a preparation of the human APC 
protein is provided which is substantially free of other human proteins. 
The amino acid sequence of the protein is shown in Figure 3 or 7. 

In yet another embodiment of the invention a method is provided
for screening for genetic predisposition to cancer, including familial
adenomatous polyposis (FAP) and Gardner's Syndrome (GS), in a human.
The method comprises: detecting among kindred persons the presence
of a DNA polymorphism which is linked to a mutant APC allele in an
individual having a genetic predisposition to cancer, said kindred being
genetically related to the individual, the presence of said polymorphism
suggesting a predisposition to cancer.

In another embodiment of the invention a method of testing
therapeutic agents for the ability to suppress a neoplastically
transformed phenotype is provided. The method comprises: applying a test
substance to a cultured epithelial cell which carries a mutation in an
APC allele; and determining whether said test substance suppresses
the neoplastically transformed phenotype of the cell.

In another embodiment of the invention a method of testing
therapeutic agents for the ability to suppress a neoplastically
transformed phenotype is provided. The method comprises: administering
a test substance to an animal which carries a mutant APC allele; and
determining whether said test substance prevents or suppresses the
growth of tumors.

In still other embodiments of the invention transgenic animals
are provided. The animals carry a mutant APC allele from a second 
animal species or have been genetically engineered to contain an inser- 
tion mutation which disrupts an APC allele. 

The present invention provides the art with the information that
the APC gene, a heretofore unknown gene, is, in fact, a target of
mutational alterations on chromosome 5q21 and that these alterations are
associated with the process of tumorigenesis. This information allows
highly specific assays to be performed to assess the neoplastic status of
a particular tissue or the predisposition to cancer of an individual. This
invention has applicability to Familial Adenomatous Polyposis, sporadic
colorectal cancers, Gardner's Syndrome, as well as the less severe
familial polyposis discussed above.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1A shows an overview of yeast artificial chromosome
(YAC) contigs. Genetic distances between selected RFLP markers
from within the contigs are shown in centiMorgans.

Figure 1B shows a detailed map of the three central contigs.
The position of the six identified genes from within the FAP region is
shown; the 5' and 3' ends of the transcripts from these genes have in
general not yet been isolated, as indicated by the string of dots
surrounding the bars denoting the genes' positions. Selected restriction
endonuclease recognition sites are indicated: B, BssH2; S, SstII;
M, MluI; N, NruI.

Figure 2 shows the sequence of TB1 and TB2 genes. The cDNA
sequence of the TB1 gene was determined from the analysis of 11
cDNA clones derived from normal colon and liver, as described in the
text. A total of 2S14 bp were contained within the overlapping cDNA
clones, defining an ORF of 424 amino acids beginning at nucleotide 1.
Only the predicted amino acids from the ORF are shown. The
carboxy-terminal end of the ORF has apparently been identified, but
the 5' end of the TB1 transcript has not yet been precisely determined.

The cDNA sequence of the TB2 gene was determined from the 
YS-39 clone derived as described in the text. This clone consisted of 
2300 bp and defined an ORF of 185 amino acids beginning at nucleotide
1. Only the predicted amino acids are shown. The carboxy-terminal
end of the ORF has apparently been identified, but the 5' end of the 
TB2 transcript has not been precisely determined. 

Figure 3 shows the sequence of the APC gene product. The
cDNA sequence was determined through the analysis of 87 cDNA clones
derived from normal colon, liver, and brain. A total of 8973 bp were
contained within overlapping cDNA clones, defining an ORF of 2842
amino acids. In-frame stop codons surrounded this ORF, as described in
the text, suggesting that the entire APC gene product was represented
in the ORF illustrated. Only the predicted amino acids are shown.

Figure 4 shows the local similarity between human APC and ral2
of yeast. Local similarity among the APC and MCC genes and the m3
muscarinic acetylcholine receptor (mAChR) is shown. The region of the
mAChR shown corresponds to that responsible for coupling the receptor
to G proteins. The connecting lines indicate identities; dots indicate
related amino acid residues.

Figure 5 shows the genomic map of the 1200 kb NotI fragment at 
the FAP locus. The NotI fragment is shown as a bold line. Relevant 
parts of the deletion chromosomes from patients 3214 and 3824 are 
shown as stippled lines. Probes used to characterize the NotI fragment 
and the deletions, and three YACs from which subclones were obtained, 
are shown below the restriction map. The chimeric end of YAC
183H12 is indicated by a dotted line. The orientation and approximate
position of MCC are indicated above the map.

Figure 6 shows the DNA sequence and predicted amino acid
sequence of DP1 (TB2). The nucleotide numbering begins at the most 5'
nucleotide isolated. A proposed initiation methionine (base 77) is
indicated in bold type. The entire coding sequence is presented.

Figure 7 shows the cDNA and predicted amino acid sequence of
DP2.5 (APC). The nucleotide numbering begins at the proposed
initiation methionine. The nucleotides and amino acids of the
alternatively spliced exon (exon 9; nucleotide positions 934-1236) are
presented in lower case letters. At the 3' end, a poly(A) addition
signal occurs at 9530, and one cDNA clone has a poly(A) at 9563. Other
cDNA clones extend beyond 9563, however, and their consensus sequence
is included here.

Figure 8 shows the arrangement of exons in DP2.5 (APC).
(A) Exon 9 corresponds to nucleotides 933-1312; exon 9a corresponds to
nucleotides 1236-1312. The stop codon in the cDNA is at nucleotide
8535. (B) Partial intronic sequence surrounding each exon is shown.

DETAILED DESCRIPTION

It is a discovery of the present invention that mutational events
associated with tumorigenesis occur in a previously unknown gene on
chromosome 5q named here the APC (Adenomatous Polyposis Coli)
gene. Although it was previously known that deletion of alleles on
chromosome 5q was common in certain types of cancers, it was not
known that a target gene of these deletions was the APC gene.
Further, it was not known that other types of mutational events in the
APC gene are also associated with cancers. The mutations of the APC
gene can involve gross rearrangements, such as insertions and
deletions. Point mutations have also been observed.

According to the diagnostic and prognostic method of the
present invention, alteration of the wild-type APC gene is detected.
"Alteration of a wild-type gene" according to the present invention
encompasses all forms of mutations, including deletions. The
alteration may be due to either rearrangements such as insertions,
inversions, and deletions, or to point mutations. Deletions may be of the
entire gene or only a portion of the gene. Somatic mutations are those
which occur only in certain tissues, e.g., in the tumor tissue, and are
not inherited in the germline. Germline mutations can be found in any
of a body's tissues. If only a single allele is somatically mutated, an
early neoplastic state is indicated. However, if both alleles are
mutated then a late neoplastic state is indicated. The finding of APC
mutations thus provides both diagnostic and prognostic information.
An APC allele which is not deleted (e.g., that on the sister chromosome
to a chromosome carrying an APC deletion) can be screened for other
mutations, such as insertions, small deletions, and point mutations. It
is believed that many mutations found in tumor tissues will be those
leading to decreased expression of the APC gene product. However,
mutations leading to non-functional gene products would also lead to a
cancerous state. Point mutational events may occur in regulatory
regions, such as in the promoter of the gene, leading to loss or
diminution of expression of the mRNA. Point mutations may also abolish
proper RNA processing, leading to loss of expression of the APC gene
product.

In order to detect the alteration of the wild-type APC gene in a
tissue, it is helpful to isolate the tissue free from surrounding normal
tissues. Means for enriching a tissue preparation for tumor cells are
known in the art. For example, the tissue may be isolated from
paraffin or cryostat sections. Cancer cells may also be separated from
normal cells by flow cytometry. These as well as other techniques for
separating tumor from normal cells are well known in the art. If the
tumor tissue is highly contaminated with normal cells, detection of
mutations is more difficult.

Detection of point mutations may be accomplished by molecular
cloning of the APC allele (or alleles) and sequencing that allele(s) using
techniques well known in the art. Alternatively, the polymerase chain
reaction (PCR) can be used to amplify gene sequences directly from a
genomic DNA preparation from the tumor tissue. The DNA sequence
of the amplified sequences can then be determined. The polymerase
chain reaction itself is well known in the art. See, e.g., Saiki et al.,
Science, Vol. 239, p. 487, 1988; U.S. 4,683,203; and U.S. 4,683,195.
Specific primers which can be used in order to amplify the gene will
be discussed in more detail below. The ligase chain reaction, which is
known in the art, can also be used to amplify APC sequences. See Wu
et al., Genomics, Vol. 4, pp. 560-569 (1989). In addition, a technique
known as allele specific PCR can be used. (See Ruano and Kidd,
Nucleic Acids Research, Vol. 17, p. 8392, 1989.) According to this
technique, primers are used which hybridize at their 3' ends to a
particular APC mutation. If the particular APC mutation is not present,
an amplification product is not observed. Amplification Refractory
Mutation System (ARMS) can also be used as disclosed in European
Patent Application Publication No. 0332435 and in Newton et al.,
Nucleic Acids Research, Vol. 17, p. 7, 1989. Insertions and deletions of
genes can also be detected by cloning, sequencing and amplification. In
addition, restriction fragment length polymorphism (RFLP) probes for
the gene or surrounding marker genes can be used to score alteration
of an allele or an insertion in a polymorphic fragment. Such a method
is particularly useful for screening among kindred persons of an
affected individual for the presence of the APC mutation found in that
individual. Single stranded conformation polymorphism (SSCP) analysis
can also be used to detect base change variants of an allele. (Orita et
al., Proc. Natl. Acad. Sci. USA, Vol. 86, pp. 2766-2770, 1989, and
Genomics, Vol. 5, pp. 874-879, 1989.) Other techniques for detecting
insertions and deletions as are known in the art can be used.
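
The allele specific PCR logic described above (a product is seen only
when the primer's 3' end matches the mutation) can be illustrated with
a minimal Python sketch. It assumes that annealing can be modeled as an
exact string match, which is a simplification; the function names and
the toy sequences are hypothetical and are not APC sequence.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def allele_specific_product(template, forward_primer, reverse_primer):
    """Return the predicted amplicon, or None if either primer fails to anneal.

    Both primers are written 5'->3'. The forward primer is given in
    top-strand sense and its 3'-terminal base sits on the mutant base;
    the reverse primer is given in bottom-strand sense.
    Annealing is modeled as an exact match (an assumption).
    """
    template = template.upper()
    start = template.find(forward_primer.upper())
    if start == -1:
        return None  # 3' end does not match: no extension, no product
    downstream_site = reverse_complement(reverse_primer)
    end = template.find(downstream_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(downstream_site)]


# Toy sequences invented for illustration; they are not APC sequence.
wild_type = "GATTACAGGCTTCACGTAAGCTTGGACCTA"
mutant = "GATTACAGGCTTCTCGTAAGCTTGGACCTA"  # single A->T substitution
forward = "ACAGGCTTCT"  # 3' base (T) matches only the mutant allele
reverse = "TAGGTCCA"

print(allele_specific_product(mutant, forward, reverse))
print(allele_specific_product(wild_type, forward, reverse))
```

With the toy inputs, the mutant template yields a 26-base product and
the wild-type template yields None, mirroring the statement that no
amplification product is observed when the mutation is absent.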

Alteration of wild-type genes can also be detected on the basis
of the alteration of a wild-type expression product of the gene. Such
expression products include both the APC mRNA as well as the APC
protein product. The sequences of these products are shown in
Figures 3 and 7. Point mutations may be detected by amplifying and
sequencing the mRNA or via molecular cloning of cDNA made from the
mRNA. The sequence of the cloned cDNA can be determined using
DNA sequencing techniques which are well known in the art. The
cDNA can also be sequenced via the polymerase chain reaction (PCR)
which will be discussed in more detail below.

Mismatches, according to the present invention, are hybridized
nucleic acid duplexes which are not 100% homologous. The lack of
total homology may be due to deletions, insertions, inversions,
substitutions or frameshift mutations. Mismatch detection can be used to
detect point mutations in the gene or its mRNA product. While these
techniques are less sensitive than sequencing, they are simpler to
perform on a large number of tumor samples. An example of a mismatch
cleavage technique is the RNase protection method, which is described
in detail in Winter et al., Proc. Natl. Acad. Sci. USA, Vol. 82, p. 7575,
1985 and Myers et al., Science, Vol. 230, p. 1242, 1985. In the practice
of the present invention the method involves the use of a labeled
riboprobe which is complementary to the human wild-type APC gene
coding sequence. The riboprobe and either mRNA or DNA isolated
from the tumor tissue are annealed (hybridized) together and
subsequently digested with the enzyme RNase A which is able to detect
some mismatches in a duplex RNA structure. If a mismatch is detected
by RNase A, it cleaves at the site of the mismatch. Thus, when the
annealed RNA preparation is separated on an electrophoretic gel
matrix, if a mismatch has been detected and cleaved by RNase A, an
RNA product will be seen which is smaller than the full-length duplex
RNA for the riboprobe and the mRNA or DNA. The riboprobe need not
be the full length of the APC mRNA or gene but can be a segment of
either. If the riboprobe comprises only a segment of the APC mRNA or
gene it will be desirable to use a number of these probes to screen the
whole mRNA sequence for mismatches.
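
As a rough illustration of why a cleaved mismatch shows up as smaller
bands on the gel, the following Python sketch computes the expected
fragment lengths of a riboprobe. It assumes, for simplicity, that every
listed site is actually cut; the function name and the numbers are
hypothetical.

```python
def protected_fragment_lengths(probe_length, cleaved_mismatches):
    """Lengths of the riboprobe pieces expected on the gel.

    probe_length: length of the hybridized riboprobe in nucleotides.
    cleaved_mismatches: distances (in nucleotides from the probe's 5' end)
        at which RNase A cuts. Only mismatches that the enzyme actually
        recognizes should be listed, since it detects some but not all.
    """
    cuts = sorted(set(cleaved_mismatches))
    if any(cut <= 0 or cut >= probe_length for cut in cuts):
        raise ValueError("cleavage sites must fall inside the probe")
    edges = [0] + cuts + [probe_length]
    return [right - left for left, right in zip(edges, edges[1:])]


print(protected_fragment_lengths(300, []))     # fully matched duplex
print(protected_fragment_lengths(300, [120]))  # one cleaved mismatch
```

A fully matched 300-nucleotide probe stays intact, whereas a single
cleaved mismatch 120 nucleotides from the 5' end gives two pieces whose
lengths sum to the full-length probe.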

In similar fashion, DNA probes can be used to detect
mismatches, through enzymatic or chemical cleavage. See, e.g., Cotton et
al., Proc. Natl. Acad. Sci. USA, Vol. 85, p. 4397, 1988; and Shenk et al.,
Proc. Natl. Acad. Sci. USA, Vol. 72, p. 989, 1975. Alternatively,
mismatches can be detected by shifts in the electrophoretic mobility of
mismatched duplexes relative to matched duplexes. See, e.g., Cariello,
Human Genetics, Vol. 42, p. 726, 1988. With either riboprobes or DNA
probes, the cellular mRNA or DNA which might contain a mutation can
be amplified using PCR (see below) before hybridization. Changes in
DNA of the APC gene can also be detected using Southern
hybridization, especially if the changes are gross rearrangements, such as
deletions and insertions.

DNA sequences of the APC gene which have been amplified by
use of polymerase chain reaction may also be screened using
allele-specific probes. These probes are nucleic acid oligomers, each of which
contains a region of the APC gene sequence harboring a known
mutation. For example, one oligomer may be about 30 nucleotides in length,
corresponding to a portion of the APC gene sequence. By use of a
battery of such allele-specific probes, PCR amplification products can be
screened to identify the presence of a previously identified mutation in
the APC gene. Hybridization of allele-specific probes with amplified
APC sequences can be performed, for example, on a nylon filter.
Hybridization to a particular probe under stringent hybridization
conditions indicates the presence of the same mutation in the tumor tissue
as in the allele-specific probe.
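
The screening of an amplified product against a battery of
allele-specific probes can be sketched in Python as follows. The sketch
assumes that stringent hybridization behaves like an exact match and
that each probe is written in the same sense as the amplified strand
shown (in practice the probe pairs with the complementary strand of the
double-stranded product). The probe names and toy sequences are
hypothetical, and the toy probes are shorter than the roughly 30
nucleotides mentioned above.

```python
def screen_with_probes(amplified, probes):
    """Return the names of probes that match the amplified sequence exactly.

    amplified: sequence of the PCR product (top strand, 5'->3').
    probes: dict mapping a probe name to its sequence; each probe carries
        one previously identified mutation.
    Exact matching stands in for stringent hybridization (an assumption).
    """
    amplified = amplified.upper()
    return sorted(name for name, seq in probes.items()
                  if seq.upper() in amplified)


# Toy data invented for illustration; not APC sequence.
tumor_product = "GATTACAGGCTTCTCGTAAGCTTGGACCTA"
probe_battery = {
    "toy_mutation_A": "GGCTTCTCGTAAG",  # present in the toy product
    "toy_mutation_B": "GCTTGCACCTA",    # absent from the toy product
}

print(screen_with_probes(tumor_product, probe_battery))
```

Only the probe whose mutation is present in the toy product is
reported, which corresponds to a positive hybridization signal for that
probe alone.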

Alteration of APC mRNA expression can be detected by any 
technique known in the art. These include Northern blot analysis, PCR 
amplification and RNase protection. Diminished mRNA expression 
indicates an alteration of the wild-type APC gene.

Alteration of wild-type APC genes can also be detected by
screening for alteration of wild-type APC protein. For example,
monoclonal antibodies immunoreactive with APC can be used to screen
a tissue. Lack of cognate antigen would indicate an APC mutation.
Antibodies specific for products of mutant alleles could also be used to
detect mutant APC gene product. Such immunological assays can be
done in any convenient format known in the art. These include
Western blots, immunohistochemical assays and ELISA assays. Any means
for detecting an altered APC protein can be used to detect alteration
of wild-type APC genes. Functional assays can be used, such as protein
binding determinations. For example, it is believed that APC protein
oligomerizes to itself and/or MCC protein or binds to a G protein.
Thus, an assay for the ability to bind to wild-type APC or MCC protein
or that G protein can be employed. In addition, assays can be used
which detect APC biochemical function. It is believed that APC is
involved in phospholipid metabolism. Thus, assaying the enzymatic
products of the involved phospholipid metabolic pathway can be used to
determine APC activity. Finding a mutant APC gene product indicates
alteration of a wild-type APC gene.

Mutant APC genes or gene products can also be detected in
other human body samples, such as serum, stool, urine and sputum.
The same techniques discussed above for detection of mutant APC
genes or gene products in tissues can be applied to other body samples.
Cancer cells are sloughed off from tumors and appear in such body
samples. In addition, the APC gene product itself may be secreted into
the extracellular space and found in these body samples even in the
absence of cancer cells. By screening such body samples, a simple
early diagnosis can be achieved for many types of cancers. In addition,
the progress of chemotherapy or radiotherapy can be monitored more
easily by testing such body samples for mutant APC genes or gene
products.

The methods of diagnosis of the present invention are applicable 
to any tumor in which APC has a role in tumorigenesis. Deletions of 
chromosome arm 5q have been observed in tumors of lung, breast, 
colon, rectum, bladder, liver, sarcomas, stomach and prostate, as well 
as in leukemias and lymphomas. Thus these are likely to be tumors in
which APC has a role. The diagnostic method of the present invention
is useful for clinicians so that they can decide upon an appropriate
course of treatment. For example, a tumor displaying alteration of
both APC alleles might suggest a more aggressive therapeutic regimen
than a tumor displaying alteration of only one APC allele. 

The primer pairs of the present invention are useful for
determination of the nucleotide sequence of a particular APC allele using the
polymerase chain reaction. The pairs of single stranded DNA primers
can be annealed to sequences within or surrounding the APC gene on
chromosome 5q in order to prime amplifying DNA synthesis of the APC
gene itself. A complete set of these primers allows synthesis of all of
the nucleotides of the APC gene coding sequences, i.e., the exons. The
set of primers preferably allows synthesis of both intron and exon
sequences. Allele specific primers can also be used. Such primers
anneal only to particular APC mutant alleles, and thus will only amplify
a product in the presence of the mutant allele as a template.

In order to facilitate subsequent cloning of amplified sequences,
primers may have restriction enzyme site sequences appended to their
5' ends. Thus, all nucleotides of the primers are derived from APC
sequences or sequences adjacent to APC except the few nucleotides
necessary to form a restriction enzyme site. Such enzymes and sites
are well known in the art. The primers themselves can be synthesized
using techniques which are well known in the art. Generally, the
primers can be made using oligonucleotide synthesizing machines which are
commercially available. Given the sequence of the APC open reading
frame shown in Figure 7, design of particular primers is well within the
skill of the art.
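
The primer construction described above (gene-derived sequence plus a
restriction enzyme site appended to the 5' end) can be sketched in
Python. This is only an illustration under stated assumptions: the
function names, the fixed primer length and the toy target are
hypothetical, and the choice of EcoRI and BamHI is arbitrary; only their
recognition sequences are standard.

```python
# Recognition sequences of three commonly used restriction enzymes.
RESTRICTION_SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def tailed_primer_pair(target, length, fwd_enzyme, rev_enzyme):
    """Return (forward, reverse) primers, each written 5'->3'.

    The 3' portion of each primer is derived from the ends of `target`;
    the few extra 5' nucleotides form the chosen restriction site.
    """
    target = target.upper()
    forward_core = target[:length]
    reverse_core = reverse_complement(target[-length:])
    return (RESTRICTION_SITES[fwd_enzyme] + forward_core,
            RESTRICTION_SITES[rev_enzyme] + reverse_core)


# Toy target invented for illustration; not APC sequence.
toy_target = "GATTACAGGCTTCACGTAAGCTTGGACCTA"
print(tailed_primer_pair(toy_target, 8, "EcoRI", "BamHI"))
```

In a real design the chosen site would also be checked against the
region to be amplified, so that the enzyme does not cut inside the
product that is to be cloned.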

The nucleic acid probes provided by the present invention are
useful for a number of purposes. They can be used in Southern
hybridization to genomic DNA and in the RNase protection method for
detecting point mutations already discussed above. The probes can be
used to detect PCR amplification products. They may also be used to
detect mismatches with the APC gene or mRNA using other
techniques. Mismatches can be detected using either enzymes (e.g., S1
nuclease), chemicals (e.g., hydroxylamine or osmium tetroxide and
piperidine), or changes in electrophoretic mobility of mismatched
hybrids as compared to totally matched hybrids. These techniques are
known in the art. See Cotton, supra; Shenk, supra; Myers, supra;
Winter, supra; and Novack et al., Proc. Natl. Acad. Sci. USA, Vol. 83,
p. 586, 1986. Generally, the probes are complementary to APC gene
coding sequences, although probes to certain introns are also
contemplated. An entire battery of nucleic acid probes is used to compose a
kit for detecting alteration of wild-type APC genes. The kit allows for
hybridization to the entire APC gene. The probes may overlap with
each other or be contiguous.
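
One way to picture a battery of probes that covers an entire sequence,
either contiguously or with overlap, is the following Python sketch.
The function name, the probe length and the overlap are hypothetical
parameters chosen for illustration; the disclosure does not prescribe
them.

```python
def tile_probes(sequence_length, probe_length, overlap=0):
    """Return (start, end) half-open intervals that cover the sequence.

    overlap=0 gives contiguous probes; overlap>0 makes each probe share
    that many nucleotides with the previous one. The last probe is
    truncated at the end of the sequence if necessary.
    """
    if probe_length <= 0 or not 0 <= overlap < probe_length:
        raise ValueError("need probe_length > 0 and 0 <= overlap < probe_length")
    step = probe_length - overlap
    tiles = []
    start = 0
    while start < sequence_length:
        end = min(start + probe_length, sequence_length)
        tiles.append((start, end))
        if end == sequence_length:
            break
        start += step
    return tiles


print(tile_probes(1000, 400))       # contiguous probes
print(tile_probes(1000, 400, 100))  # probes overlapping by 100 nt
```

Either layout leaves no position of the target uncovered, which is the
property a kit needs in order to allow hybridization to the entire
gene.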

If a riboprobe is used to detect mismatches with mRNA, it is
complementary to the mRNA of the human wild-type APC gene. The
riboprobe thus is an anti-sense probe in that it does not code for the
APC protein because it is of the opposite polarity to the sense strand.
The riboprobe generally will be labeled with a radioactive,
colorimetric, or fluorometric material, which can be accomplished by
any means known in the art. If the riboprobe is used to detect
mismatches with DNA it can be of either polarity, sense or anti-sense.
Similarly, DNA probes also may be used to detect mismatches.
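
The relationship between an mRNA segment and its anti-sense riboprobe
can be shown with a short Python sketch. The function name and the toy
mRNA are hypothetical; the sketch only encodes standard base pairing
(A with U, G with C) and the opposite polarity of the two strands.

```python
def antisense_riboprobe(mrna):
    """Return the anti-sense RNA (written 5'->3') for an mRNA segment.

    The probe is the reverse complement of the sense strand, so it can
    base-pair with the mRNA but does not itself encode the protein.
    """
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(mrna.upper()))


# Toy mRNA segment invented for illustration.
print(antisense_riboprobe("AUGGCAUUC"))
```

Applying the function twice returns the original sense sequence, which
is a quick check that the probe is of exactly opposite polarity.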

Nucleic acid probes may also be complementary to mutant 
alleles of the APC gene. These are useful to detect similar mutations
in other patients on the basis of hybridization rather than mismatches.
These are discussed above and referred to as allele-specific probes. As
mentioned above, the APC probes can also be used in Southern
hybridizations to genomic DNA to detect gross chromosomal changes such as
deletions and insertions. The probes can also be used to select cDNA 
clones of APC genes from tumor and normal tissues. In addition, the 
probes can be used to detect APC mRNA in tissues to determine if 
expression is diminished as a result of alteration of wild-type APC 
genes. Provided with the APC coding sequence shown in Figure 7 (SEQ 
ID NO: 1), design of particular probes is well within the skill of the 
ordinary artisan. 

According to the present invention a method is also provided of
supplying wild-type APC function to a cell which carries mutant APC
alleles. Supplying such function should suppress neoplastic growth of
the recipient cells. The wild-type APC gene or a part of the gene may
be introduced into the cell in a vector such that the gene remains
extrachromosomal. In such a situation the gene will be expressed by
the cell from the extrachromosomal location. If a gene portion is
introduced and expressed in a cell carrying a mutant APC allele, the
gene portion should encode a part of the APC protein which is required
for non-neoplastic growth of the cell. More preferred is the situation
where the wild-type APC gene or a part of it is introduced into the
mutant cell in such a way that it recombines with the endogenous
mutant APC gene present in the cell. Such recombination requires a
double recombination event which results in the correction of the APC
gene mutation. Vectors for introduction of genes both for
recombination and for extrachromosomal maintenance are known in the art and
any suitable vector may be used. Methods for introducing DNA into
cells such as electroporation, calcium phosphate co-precipitation and
viral transduction are known in the art and the choice of method is
within the competence of the routineer. Cells transformed with the
wild-type APC gene can be used as model systems to study cancer
remission and drug treatments which promote such remission.

Similarly, cells and animals which carry a mutant APC allele can
be used as model systems to study and test for substances which have
potential as therapeutic agents. The cells are typically cultured
epithelial cells. These may be isolated from individuals with APC
mutations, either somatic or germline. Alternatively, the cell line can
be engineered to carry the mutation in the APC allele. After a test
substance is applied to the cells, the neoplastically transformed
phenotype of the cell will be determined. Any trait of neoplastically
transformed cells can be assessed, including anchorage-independent
growth, tumorigenicity in nude mice, invasiveness of cells, and growth
factor dependence. Assays for each of these traits are known in the art.

Animals for testing therapeutic agents can be selected after 
mutagenesis of whole animals or after treatment of germline cells or 
zygotes. Such treatments include insertion of mutant APC alleles, usually
from a second animal species, as well as insertion of disrupted
homologous genes. Alternatively, the endogenous APC gene(s) of the 
animals may be disrupted by insertion or deletion mutation. After test 
substances have been administered to the animals, the growth of 
tumors must be assessed. If the test substance prevents or suppresses 
the growth of tumors, then the test substance is a candidate therapeutic
agent for the treatment of FAP and/or sporadic cancers.

Polypeptides which have APC activity can be supplied to cells
which carry mutant or missing APC alleles. The sequence of the APC
protein is disclosed in Figure 3 or 7 (SEQ ID NO: 7 or 1). These two
sequences differ slightly and appear to indicate the existence of two
different forms of the APC protein. Protein can be produced by
expression of the cDNA sequence in bacteria, for example, using known
expression vectors. Alternatively, APC can be extracted from
APC-producing mammalian cells such as brain cells. In addition, the
techniques of synthetic chemistry can be employed to synthesize APC
protein. Any of such techniques can provide the preparation of the
present invention which comprises the APC protein. The preparation
is substantially free of other human proteins. This is most readily
accomplished by synthesis in a microorganism or in vitro.

Active APC molecules can be introduced into cells by
microinjection or by use of liposomes, for example. Alternatively,
some such active molecules may be taken up by cells, actively or by
diffusion. Extracellular application of APC gene product may be
sufficient to affect tumor growth. Supply of molecules with APC activity
should lead to a partial reversal of the neoplastic state. Other
molecules with APC activity may also be used to effect such a reversal,
for example peptides, drugs, or organic compounds.

The present invention also provides a preparation of antibodies
immunoreactive with a human APC protein. The antibodies may be
polyclonal or monoclonal and may be raised against native APC
protein, APC fusion proteins, or mutant APC proteins. The antibodies
should be immunoreactive with APC epitopes, preferably epitopes not
present on other human proteins. In a preferred embodiment of the
invention the antibodies will immunoprecipitate APC proteins from
solution as well as react with APC protein on Western or immunoblots
of polyacrylamide gels. In another preferred embodiment, the
antibodies will detect APC proteins in paraffin or frozen tissue sections, using
immunocytochemical techniques. Techniques for raising and purifying
antibodies are well known in the art and any such techniques may be
chosen to achieve the preparation of the invention.

Predisposition to cancers as in FAP and GS can be ascertained 
by testing any tissue of a human for mutations of the APC gene. For
example, a person who has inherited a germline APC mutation would be 
prone to develop cancers. This can be determined by testing DNA from 
any tissue of the person's body. Most simply, blood can be drawn and 
DNA extracted from the cells of the blood. In addition, prenatal diag- 
nosis can be accomplished by testing fetal cells, placental cells, or 
amniotic fluid for mutations of the APC gene. Alteration of a wild- 
type APC allele, whether for example, by point mutation or by dele- 
tion, can be detected by any of the means discussed above. 

Molecules of cDNA according to the present invention are 
intron-free, APC gene coding molecules. They can be made by reverse
transcriptase using the APC mRNA as a template. These molecules
can be propagated in vectors and cell lines as is known in the art. Such
molecules have the sequence shown in SEQ ID NO: 7. The cDNA can
also be made using the techniques of synthetic chemistry given the 
sequence disclosed herein. 

A short region of homology has been identified between APC and
the human m3 muscarinic acetylcholine receptor (mAChR). This
homology was largely confined to 19 residues in which 6 out of 7 amino
acids (EL(G or A)GLQA) were identical (see Figure 4). Initially, it was
not known whether this homology was significant, because many other
proteins had higher levels of global homology (though few had six out of
seven contiguous amino acids in common). However, a study on the
sequence elements controlling G protein activation by mAChR subtypes
(Lechleiter et al., EMBO J., p. 4381 (1990)) has shown that a 21 amino
acid region from the m3 mAChR completely mediated G protein
specificity when substituted for the 21 amino acids of m2 mAChR at the
analogous protein position. These 21 residues overlap the 19 amino acid
homology between APC and m3 mAChR.
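
The "6 out of 7" figure quoted above can be reproduced with a short Python sketch that counts identical positions between two aligned peptides. The text gives the motif as EL(G or A)GLQA but does not say which protein carries glycine and which carries alanine at the third position, so the two variable names below are neutral and hypothetical.

```python
def count_identities(a: str, b: str) -> int:
    """Count positions at which two equal-length peptides carry the same residue."""
    if len(a) != len(b):
        raise ValueError("peptides must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b))


# The two readings of EL(G or A)GLQA; assignment to APC or mAChR is not
# stated in the text, so the names are deliberately neutral.
motif_with_glycine = "ELGGLQA"
motif_with_alanine = "ELAGLQA"

identical = count_identities(motif_with_glycine, motif_with_alanine)
total = len(motif_with_glycine)
```

Here `identical` and `total` correspond to the six identical residues and the seven-residue window described in the text.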

This connection between APC and the G protein activating
region of mAChR is intriguing in light of previous investigations
relating G proteins to cancer. For example, the RAS oncogenes, which are
often mutated in colorectal cancers (Vogelstein et al., N. Engl. J.
Med., Vol. 319, p. 525 (1988); Bos et al., Nature, Vol. 327, p. 293 (1987)),
are members of the G protein family (Bourne et al., Nature, Vol. 348,
p. 125 (1990)) as is an in vitro transformation suppressor (Noda et al.,
Proc. Natl. Acad. Sci. USA, Vol. 86, p. 162 (1989)) and genes mutated in
hormone producing tumors (Landis et al., Nature, Vol. 340, p. 692
(1989); Lyons et al., Science, Vol. 249, p. 655 (1990)). Additionally, the
gene responsible for neurofibromatosis (presumably a tumor suppressor
gene) has been shown to activate the GTPase activity of RAS (Xu et al.,
Cell, Vol. 63, p. 835 (1990); Martin et al., Cell, Vol. 63, p. 843 (1990);
Ballester et al., Cell, Vol. 63, p. 851 (1990)). Another interesting link
between G proteins and colon cancer involves the drug sulindac. This
agent has been shown to inhibit the growth of benign colon tumors in
patients with FAP, presumably by virtue of its activity as a
cyclooxygenase inhibitor (Waddell et al., J. Surg. Oncology 24(1), 83
(1983); Waddell et al., Am. J. Surg. 157(1), 175 (1989); Charneau et al.,
Gastroenterologie Clinique et Biologique 14(2), 153 (1990)).
Cyclooxygenase is required to convert arachidonic acid to
prostaglandins and other biologically active molecules. G proteins are
known to regulate phospholipase A2 activity, which generates
arachidonic acid from phospholipids (Role et al., Proc. Natl. Acad. Sci.
USA, Vol. 84, p. 3623 (1987); Kurachi et al., Nature, Vol. 337, p. 535
(1989)). Therefore we propose that wild-type APC protein functions by
interacting with a G protein and is involved in phospholipid
metabolism.

The following are provided for exemplification purposes only and
are not intended to limit the scope of the invention which has been
described in broad terms above.
Example 1

This example demonstrates the isolation of a 5.5 Mb region of
human DNA linked to the FAP locus. Six genes are identified in this
region, all of which are expressed in normal colon cells and in
colorectal, lung, and bladder tumors.

The cosmid markers YN5.64 and YN5.48 have previously been
shown to delimit an 8 cM region containing the locus for FAP
(Nakamura et al., Am. J. Hum. Genet., Vol. 43, p. 638 (1988)). Further
linkage and pulse-field gel electrophoresis (PFGE) analysis with
additional markers has shown that the FAP locus is contained within a 4 cM
region bordered by cosmids EF5.44 and L5.99. In order to isolate clones
representing a significant portion of this locus, a yeast artificial
chromosome (YAC) library was screened with various 5q21 markers.
Twenty-one YAC clones, distributed within six contigs and including
5.5 Mb from the region between YN5.64 and YN5.48, were obtained
(Figure 1A).

Three contigs encompassing approximately 4 Mb were contained
within the central portion of this region. The YACs constituting these
contigs, together with the markers used for their isolation and
orientations, are shown in Figure 1. These YAC contigs were obtained in the
following way. To initiate each contig, the sequence of a genomic
marker cloned from chromosome 5q21 was determined and used to
design primers for PCR. PCR was then carried out on pools of YAC
clones distributed in microtiter trays as previously described (Anand
et al., Nucleic Acids Research, Vol. 18, p. 1951 (1990)). Individual YAC
clones from the positive pools were identified by further PCR or
hybridization based assays, and the YAC sizes were determined by
PFGE.

To extend the areas covered by the original YAC clones,
"chromosomal walking" was performed. For this purpose, YAC termini were
isolated by a PCR based method and sequenced (Riley et al., Nucleic
Acids Research, Vol. 18, p. 2887 (1990)). PCR primers based on these
sequences were then used to rescreen the YAC library. For example,
the sequence from an intron of the FER gene (Hao et al., Mol. Cell.
Biol., Vol. 9, p. 1587 (1989)) was used to design PCR primers for
isolation of the 28EC1 and 5EH8 YACs. The termini of the 28EC1 YAC
were sequenced to derive markers RHE28 and LHE28, respectively.
The sequences of these two markers were then used to isolate YAC
clones 15CH12 (from RHE28) and 40CF1 and 29EF1 (from LHE28).
These five YACs formed a contig encompassing 1200 kb (contig 1,
Figure 1B).

Similarly, contig 2 was initiated using cosmid N5.66 sequences,
and contig 3 was initiated using sequences both from the MCC gene and
from cosmid EF5.44. A walk in the telomeric direction from YAC
14FH1 and a walk in the opposite direction from YAC 39GG3 allowed
connection of the initial contig 3 clones through YAC 37HG4
(Figure 1B).

Multipoint linkage analysis with the various markers used to
define the contigs, combined with PFGE analysis, showed that contigs 1
and 2 were centromeric to contig 3. These contigs were used as tools
to orient and/or identify genes which might be responsible for FAP.
Six genes were found to lie within this cluster of YACs, as follows:

Contig 1: FER - The FER gene was discovered through its
homology to the viral oncogene ABL (Hao et al., supra). It has an
intrinsic tyrosine kinase activity, and in situ hybridization with an FER
probe showed that the gene was located at 5q11-23 (Morris et al.,
Cytogenet. Cell. Genet., Vol. 53, p. 4 (1990)). Because of the potential
role of this oncogene-related gene in neoplasia, we decided to evaluate
it further with regard to the FAP locus. A human genomic clone from
FER was isolated (MF 2.3) and used to define a restriction fragment
length polymorphism (RFLP), and the RFLP in turn was used to map FER by
linkage analysis using a panel of three-generation families. This
showed that FER was very tightly linked to previously defined
polymorphic markers for the FAP locus. The genetic mapping of FER
was complemented by physical mapping using the YAC clones derived
from FER sequences (Figure 1B). Analysis of YAC contig 1 showed that
FER was within 600 kb of cosmid marker M5.28, which maps to within
1.5 Mb of cosmid L5.99 by PFGE of human genomic DNA. Thus, the
YAC mapping results were consistent with the FER linkage data and
PFGE analyses.

Contig 2: TB1 - TB1 was identified through a cross-hybridization
approach. Exons of genes are often evolutionarily conserved while
introns and intergenic regions are much less conserved. Thus, if a
human probe cross-hybridizes strongly to the DNA from non-primate
species, there is a reasonable chance that it contains exon sequences.
Subclones of the cosmids shown in Figure 1 were used to screen Southern
blots containing rodent DNA samples. A subclone of cosmid N5.66
(p5.66-4) was shown to strongly hybridize to rodent DNA, and this
clone was used to screen cDNA libraries derived from normal adult
colon and fetal liver. The ends of the initial cDNA clones obtained in
this screen were then used to extend the cDNA sequence. Eventually,
11 cDNA clones were isolated, covering 2314 bp. The gene detected by
these clones was named TB1. Sequence analysis of the overlapping
clones revealed an open reading frame (ORF) that extended for 1302 bp
starting from the most 5' sequence data obtained (Figure 2A). If this
entire open reading frame were translated, it would encode 434 amino
acids. The product of this gene was not globally homologous to any
other sequence in the current database but showed two significant local
similarities to a family of ADP, ATP carrier/translocator proteins and
mitochondrial brown fat uncoupling proteins which are widely
distributed from yeast to mammals. These conserved regions of TB1
(underlined in Figure 2A) may define a predictive motif for this
sequence family. In addition, TB1 appeared to contain a signal peptide
(or mitochondrial targeting sequence) as well as at least 7
transmembrane domains.

Contig 3: MCC, TB2, SRP and APC - The MCC gene was also
discovered through a cross-hybridization approach, as described
previously (Kinzler et al., Science, Vol. 251, p. 1366 (1991)). The MCC gene
was considered a candidate for causing FAP by virtue of its tight
genetic linkage to FAP susceptibility and its somatic mutation in
sporadic colorectal carcinomas. However, mapping experiments suggested
that the coding region of MCC was approximately 50 kb proximal to
the centromeric end of a 200 kb deletion found in an FAP patient.
MCC cDNA probes detected a 10 kb mRNA transcript on Northern blot
analysis of which 4151 bp, including the entire open reading frame,
have been cloned. Although the 3' non-translated portion or an
alternatively spliced form of MCC might have extended into this deletion, it
was possible that the deletion did not affect the MCC gene product.
We therefore used MCC sequences to initiate a YAC contig, and
subsequently used the YAC clones to identify genes 50 to 250 kb distal to
MCC that might be contained within the deletion.

In a first approach, the insert from YAC 24ED6 (Figure 1B) was
radiolabelled and hybridized to a cDNA library from normal colon. One
of the cDNA clones (YS39) identified in this manner detected a 3.1 kb
mRNA transcript when used as a probe for Northern blot hybridization.
Sequence analysis of the YS39 clone revealed that it encompassed 2263
nucleotides and contained an ORF that extended for 555 bp from the
most 5' sequence data obtained. If all of this ORF were translated, it
would encode 185 amino acids (Figure 2B). The gene detected by YS39
was named TB2. Searches of nucleotide and protein databases revealed
that the TB2 gene was not identical to any previously reported
sequences nor were there any striking similarities.
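
The amino acid counts quoted for the 1302 bp and 555 bp open reading frames described above follow directly from the triplet code. A minimal Python sketch of that arithmetic is given below; it assumes, as the text implies, that the stated ORF length counts coding nucleotides only and excludes any stop codon. The function name is hypothetical.

```python
def residues_encoded(orf_length_bp: int) -> int:
    """Number of amino acids encoded by an ORF of the given length.

    Assumes the length covers complete codons only (no stop codon and
    no partial codon), so it must be a positive multiple of three.
    """
    if orf_length_bp <= 0 or orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_length_bp // 3


first_orf = residues_encoded(1302)   # ORF of 1302 bp
second_orf = residues_encoded(555)   # ORF of 555 bp
```

The two results agree with the 434 and 185 amino acids stated in the text.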

Another clone (YS11) identified through the YAC 24ED6 screen
appeared to contain portions of two distinct genes. Sequences from
one end of YS11 were identical to at least 180 bp of the signal
recognition particle protein SRP19 (Lingelbach et al., Nucleic Acids Research,
Vol. 16, p. 9431 (1988)). A second ORF, from the opposite end of clone
YS11, proved to be identical to 78 bp of a novel gene which was
independently identified through a second YAC-based approach. For the
latter, DNA from yeast cells containing YAC 14FH1 (Figure 1B) was
digested with EcoRI and subcloned into a plasmid vector. Plasmids that
contained human DNA fragments were selected by colony hybridization
using total human DNA as a probe. These clones were then used to
search for cross-hybridizing sequences as described above for TB1, and
the cross-hybridizing clones were subsequently used to screen cDNA
libraries. One of the cDNA clones discovered in this way (FH38)
contained a long ORF (2496 bp), 78 bp of which were identical to the
above-noted sequences in YS11. The ends of the FH38 cDNA clone
were then used to initiate cDNA walking to extend the sequence.
Eventually, 85 cDNA clones were isolated from normal colon, brain and
liver cDNA libraries and found to encompass 8973 nucleotides of
contiguous transcript. The gene corresponding to this transcript was
named APC. When used as probes for Northern blot analysis, APC
cDNA clones hybridized to a single transcript of approximately 9.5 kb,
suggesting that the great majority of the gene product was represented
in the cDNA clones obtained. Sequences from the 5' end of the APC
gene were found in YAC 37HG4 but not in YAC 14FH1. However, the
3' end of the APC gene was found in 14FH1 as well as 37HG4. The
yeast artificial chromosome of the present invention designated
YAC 37HG4 has been deposited with the National Collection of
Industrial and Marine Bacteria (NCIMB), P.O. Box 31, 135 Abbey Road,
Aberdeen AB9 8DG, Scotland, prior to the filing of this patent
application. The NCIMB Accession Number of YAC clone YAC 37HG4 is
40353. Analogously, the 5' end of the MCC coding region was found in
YAC clones 19AA9 and 26GC3 but not 24ED6 or 14FH1, while the 3'
end displayed the opposite pattern. Thus, MCC and APC transcription
units pointed in opposite directions, with the direction of transcription
going from centromeric to telomeric in the case of MCC, and telomeric
to centromeric in the case of APC. PFGE analysis of YAC DNA
digested with various restriction endonucleases showed that TB2 and
SRP were between MCC and APC, and that the 3' ends of the coding
regions of MCC and APC were separated by approximately 150 kb
(Figure 1B).

Sequence analysis of the APC cDNA clones revealed an open
reading frame of 8,535 nucleotides. The 5' end of the ORF contained a
methionine codon (codon 1) that was preceded by an in-frame stop
codon 9 bp upstream, and the 3' end was followed by several in-frame
stop codons. The protein produced by initiation at codon 1 would
contain 2,842 amino acids (Figure 3). The results of database searching
with the APC gene product were quite complex due to the presence of
large segments with locally biased amino acid compositions. In spite of
this, APC could be roughly divided into two domains. The N-terminal
25% of the protein had a high content of leucine residues (12%) and
showed local sequence similarities to myosins, various intermediate
filament proteins (e.g., desmin, vimentin, neurofilaments) and
Drosophila armadillo/human plakoglobin. The latter protein is a
component of adhesive junctions (desmosomes) joining epithelial cells
(Franke et al., Proc. Natl. Acad. Sci. U.S.A., Vol. 86, p. 4027 (1989);
Peifer et al., Cell, Vol. 63, p. 1167 (1990)). The C-terminal 75% of APC
(residues 731-2832) is 17% serine by composition with serine residues
more or less uniformly distributed. This large domain also contains
local concentrations of charged (mostly acidic) and proline residues.
There was no indication of potential signal peptides, transmembrane
regions, or nuclear targeting signals in APC, suggesting a cytoplasmic
localization.
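
Compositional figures such as the leucine and serine percentages quoted above are simple residue frequencies over a stretch of protein sequence. The following Python sketch shows one way such a percentage might be computed; the 25-residue fragment is a hypothetical string used only to exercise the function and is not part of the APC sequence.

```python
def residue_percent(protein: str, residue: str) -> float:
    """Percentage of positions in `protein` occupied by `residue`."""
    if not protein:
        raise ValueError("empty sequence")
    return 100.0 * protein.upper().count(residue.upper()) / len(protein)


# Hypothetical 25-residue fragment (illustrative only, one-letter code).
fragment = "MLSLKELLSSDLERLSSLASPLSSE"

leucine_pct = residue_percent(fragment, "L")
serine_pct = residue_percent(fragment, "S")
```

Applied to a domain of a real sequence (for example a slice covering a chosen residue range), the same function would give the kind of domain-level composition described in the text.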

To detect short similarities to APC, a database search was
performed using the PAM-40 matrix (Altschul, J. Mol. Biol., Vol. 219, p. 555
(1991)). Potentially interesting matches to several proteins were found.
The most suggestive of these involved the ral2 gene product of yeast,
which is implicated in the regulation of ras activity (Fukui et al., Mol.
Cell. Biol., Vol. 9, p. 5617 (1989)). Little is known about how ral2 might
interact with ras but it is interesting to note the positively-charged
character of this region in the context of the negatively-charged GAP
interaction region of ras. A specific electrostatic interaction between
ras and GAP-related proteins has been proposed.

Because of the proximity of the MCC and APC genes, and the
fact that both are implicated in colorectal tumorigenesis, we searched
for similarities between the two predicted proteins. Bourne has
previously noted that MCC has the potential to form alpha helical coiled
coils (Nature, Vol. 351, p. 188 (1991)). Lupas and colleagues have
recently developed a program for predicting coiled coil potential from
primary sequence data (Science, Vol. 252, p. 1162 (1991)) and we have
used their program to analyze both MCC and APC. Analysis of MCC
indicated a discontinuous pattern of coiled-coil domains separated by
putative "hinge" or "spacer" regions similar to those seen in laminin
and other intermediate filament proteins. Analysis of the APC
sequence revealed two regions in the N-terminal domain which had
strong coiled coil-forming potential, and these regions corresponded to
those that showed local similarities with myosin and IF proteins on
database searching. In addition, one other putative coiled coil region
was identified in the central region of APC. The potential for both
APC and MCC to form coiled coils is interesting in that such structures
often mediate homo- and hetero-oligomerization.

Finally, it had previously been noted that MCC shared a short
similarity with the region of the m3 muscarinic acetylcholine receptor
(mAChR) known to regulate specificity of G-protein coupling. The
APC gene also contained a local similarity to the region of the m3
mAChR that overlapped with the MCC similarity (Figure 4B). Although
the similarities to ral2 (Figure 4A) and m3 mAChR (Figure 4B) were not
statistically significant, they were intriguing in light of previous
observations relating G-proteins to neoplasia.

Each of the six genes described above was expressed in normal
colon mucosa, as indicated by their representation in colon cDNA
libraries. To study expression of the genes in neoplastic colorectal
epithelium, we employed reverse transcription-polymerase chain
reaction (PCR) assays. The sequences of FER, TB1, TB2,
MCC, and APC were each used to design primers for PCR performed
with cDNA templates. Each of these genes was found to be expressed
in normal colon, in each of ten cell lines derived from colorectal
cancers, and in tumor cell lines derived from lung and bladder tumors. The
ten colorectal cancer cell lines included eight from patients with
sporadic CRC and two from patients with FAP.
Example 2 

This example demonstrates a genetic analysis of the role of the 
FER gene in FAP and sporadic colorectal cancers. 

We considered FER as a candidate because of its proximity to
the FAP locus as judged by physical and genetic criteria (see
Example 1), and its homology to known tyrosine kinases with oncogenic
potential. Primers were designed to PCR-amplify the complete coding
sequence of FER from the RNA of two colorectal cancer cell lines
derived from FAP patients. cDNA was generated from RNA and used
as a template for PCR. The primers used were
5'-AGAAGGATCCCTTGTGCAGTGTGGA-3' and
5'-GACAGGATCCTGAAGCTGAGTTTG-3'. The underlined nucleotides
were altered from the true FER sequence to create BamHI sites. The
cell lines used were JW and DiFi, both derived from colorectal cancers
of FAP patients (C. Paraskeva, B.G. Buckle, D. Sheer, C.E. Wigley,
Int. J. Cancer 34, 49 (1984); M.E. Gross et al., Cancer Res. 51, 1452
(1991)). The resultant 2554 basepair fragments were cloned and
sequenced in their entirety. The PCR products were cloned in the
BamHI site of Bluescript SK (Stratagene) and pools of at least 50 clones
were sequenced en masse using T7 polymerase, as described in Nigro
et al., Nature 342, 705 (1989).

Only a single conservative amino acid change (GTG->CTG, cre- 
ating a val to leu substitution at codon 439) was observed. The region 
surrounding this codon was then amplified from the DNA of individuals 
without FAP and this substitution was found to be a common 
polymorphism, not specifically associated with FAP. Based on these 
results, we considered it unlikely (though still possible) that the FER gene
was responsible for FAP. To amplify the regions surrounding codon
439, the following primers were used: 5'-TCAGAAAGTGCTGAAGAG-3'
and 5'-GGAATAATTAGGTCTCCAA-3'. PCR products were digested
with PstI, which yields a 50 bp fragment if codon 439 is leucine, but 26
and 24 bp fragments if it is valine. The primers used for sequencing
were chosen from the FER cDNA sequence in Hao et al., supra.
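
The PstI digestion readout described above can be expressed as a small genotype-calling rule. The Python sketch below follows the fragment sizes exactly as stated in the text (50 bp for leucine; 26 and 24 bp for valine). The heterozygote case, in which all three fragments appear together, is an assumption of this sketch and is not described in the text; the function name is hypothetical.

```python
def call_codon_439(fragment_sizes_bp):
    """Call the codon 439 genotype from PstI digestion fragment sizes.

    Per the text: an undigested 50 bp product indicates leucine, and
    26 bp + 24 bp fragments indicate valine. Observing both patterns
    at once is treated here as a heterozygote (an assumption).
    """
    observed = set(fragment_sizes_bp)
    has_leu_pattern = 50 in observed
    has_val_pattern = {24, 26} <= observed
    if has_leu_pattern and has_val_pattern:
        return "leu/val"
    if has_leu_pattern:
        return "leu/leu"
    if has_val_pattern:
        return "val/val"
    raise ValueError("fragment pattern does not match either allele")


genotype = call_codon_439([26, 24])
```

In practice the fragment sizes would be read from a gel, so a real implementation would likely need a size tolerance rather than exact matching.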






Example 3

This example demonstrates the genetic analysis of MCC, TB2,
SRP and APC in FAP and sporadic colorectal tumors. Each of these
genes is linked and encompassed by contig 3 (see Figure 1).

Several lines of evidence suggested that this contig was of
particular interest. First, at least three of the four genes in this contig
were within the deleted region identified in two FAP patients. (See
Example 5 infra.) Second, allelic deletions of chromosome 5q21 in
sporadic cancers appeared to be centered in this region (Ashton-Rickardt
et al., Oncogene, in press; and Miki et al., Jpn. J. Cancer Res., in
press). Some tumors exhibited loss of proximal RFLP markers (up to
and potentially including the 5' end of MCC), but no loss of markers
distal to MCC. Other tumors exhibited loss of markers distal to and
perhaps including the 3' end of MCC, but no loss of sequences proximal
to MCC. This suggested either that different ends of MCC were
affected by loss in all such cases, or alternatively, that two genes (one
proximal to and perhaps including MCC, the other distal to MCC) were
separate targets of deletion. Third, clones from each of the six FAP
region genes were used as probes on Southern blots containing tumor
DNA from patients with sporadic CRC. Only two examples of somatic
changes were observed in over 200 tumors studied: a
rearrangement/deletion whose centromeric end was located within the
MCC gene (Kinzler et al., supra) and an 800 bp insertion within the
APC gene between nucleotides 4424 and 5584. Fourth, point mutations
of MCC were observed in two tumors (Kinzler et al., supra), strongly
suggesting that MCC was a target of mutation in at least some sporadic
colorectal cancers.

Based on these results, we attempted to search for subtle alter- 
ations of contig 3 genes in patients with FAP. We chose to examine 
MCC and APC, rather than TB2 or SRP. because of the somatic muta- 
tions in MCC and APC noted above. To facilitate the identification of 
subtle alterations, the genomic sequences of MCC and APC exons were 
determined (see Table I). These sequences were used to design primers
for PCR analysis of constitutional DNA from FAP patients.

We first amplified eight exons and surrounding introns of the
MCC gene in affected individuals from 90 different FAP kindreds. The
PCR products were analyzed by a ribonuclease (RNase) protection assay.
In brief, the PCR products were hybridized to in vitro transcribed RNA
probes representing the normal genomic sequences. The hybrids were
digested with RNase A, which can cleave at single base pair
mismatches within DNA-RNA hybrids, and the cleavage products were
visualized following denaturing gel electrophoresis. Two separate
RNase protection analyses were performed for each exon, one with the
sense and one with the antisense strand. Under these conditions,
approximately 40% of all mismatches are detectable. Although some
amino acid variants of MCC were observed in FAP patients, all such
variants were found in a small percentage of normal individuals. These
variants were thus unlikely to be responsible for the inheritance of
FAP.

We next examined three exons of the APC gene. The three 
exons examined included those containing nt 822-930, 931-1309, and
the first 300 nt of the most distal exon (nt 1956-2256). PCR and RNase
protection analysis were performed as described in Kinzler et al., supra,
using the primers underlined in Table I. The primers for nt 1955-2256
were 5'-GCAAATCCTAAGAGAGAACAA-3' and
5'-GATGGCAAGCTTGAGCGAG-3'.

In 90 kindreds, the RNase protection method was used to screen
for mutations and in an additional 13 kindreds, the PCR products were
cloned and sequenced to search for mutations not detectable by RNase
protection. PCR products were cloned into a Bluescript vector
modified as described in T.A. Holton and M.W. Graham, Nucleic Acids Res.
19, 1156 (1991). A minimum of 100 clones were pooled and sequenced.
Five variants were detected among the 103 kindreds analyzed. Cloning
and subsequent DNA sequencing of the PCR product of patient P21
indicated a C to T transition in codon 413 that resulted in a change
from arginine to cysteine. This amino acid variant was not observed in
any of 200 DNA samples from individuals without FAP. Cloning and
sequencing of the PCR product from patients P24 and P34, who
demonstrated the same abnormal RNase protection pattern, indicated that
both had a C to T transition at codon 301 that resulted in a change
from arginine (CGA) to a stop codon (TGA). This change was not
present in 200 individuals without FAP. As this point mutation resulted
in the predicted loss of the recognition site for the enzyme Taq I,
appropriate PCR products could be digested with Taq I to detect the
mutation. This allowed us to determine that the stop codon
co-segregated with disease phenotype in members of the family of P24.
The inheritance of this change in affected members of the pedigree
provides additional evidence for the importance of the mutation.
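
The loss of a Taq I site by the CGA to TGA change can be illustrated with a short Python sketch. Taq I recognizes the sequence TCGA, so the arginine codon CGA forms a site when the preceding base is T. The 15-nucleotide flanking sequences below are hypothetical and serve only to show the principle; they are not taken from the APC sequence.

```python
TAQ_I_SITE = "TCGA"  # Taq I recognition sequence


def count_sites(seq: str, site: str = TAQ_I_SITE) -> int:
    """Count occurrences (including overlapping ones) of `site` in `seq`."""
    seq = seq.upper()
    return sum(
        1
        for i in range(len(seq) - len(site) + 1)
        if seq[i:i + len(site)] == site
    )


# Hypothetical sequences: the codon CGA is preceded by T in the normal
# allele, and the C-to-T transition converts it to TGA.
normal_allele = "GGAAGTTCGAGAACC"
mutant_allele = "GGAAGTTTGAGAACC"

sites_normal = count_sites(normal_allele)
sites_mutant = count_sites(mutant_allele)
```

A PCR product from the normal allele would therefore be cut by Taq I while the mutant product would remain uncut, which is the basis of the co-segregation test described above.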

Cloning and sequencing of the PCR product from FAP patient
P93 indicated a C to G transversion at codon 279, also resulting in a
stop codon (change from TCA to TGA). This mutation was not present
in 200 individuals without FAP. Finally, one additional mutation
resulting in a serine (TCA) to stop codon (TGA) at codon 712 was detected in
a single patient with FAP (patient P60).
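
The codon changes reported in this example can be checked against the standard genetic code. The Python sketch below classifies a single-base change as a transition or transversion and reports the encoded residues before and after. It assumes TCA as the serine codon, since TCA is the only serine codon that a single C to G change converts to TGA; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
PURINES = {"A", "G"}


def substitution_type(ref_base: str, alt_base: str) -> str:
    """Transition = purine<->purine or pyrimidine<->pyrimidine."""
    same_class = (ref_base in PURINES) == (alt_base in PURINES)
    return "transition" if same_class else "transversion"


def describe_point_mutation(ref_codon: str, alt_codon: str):
    """Return (substitution type, reference residue, altered residue)."""
    diffs = [(r, a) for r, a in zip(ref_codon, alt_codon) if r != a]
    if len(ref_codon) != 3 or len(alt_codon) != 3 or len(diffs) != 1:
        raise ValueError("expected two codons differing at exactly one base")
    ref_base, alt_base = diffs[0]
    return (
        substitution_type(ref_base, alt_base),
        CODON_TABLE[ref_codon],
        CODON_TABLE[alt_codon],
    )


arg_to_stop = describe_point_mutation("CGA", "TGA")  # codon 301
ser_to_stop = describe_point_mutation("TCA", "TGA")  # codons 279 and 712
```

Both changes introduce a premature stop codon, which is expected to truncate the APC protein.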

The five germline mutations identified are summarized in
Table IIA, as well as four others discussed in Example 9. In addition to
these germline mutations, we identified several somatic mutations of
MCC and APC in sporadic CRCs. Seventeen MCC exons were
examined in 90 sporadic colorectal cancers by RNase protection analysis. In
each case where an abnormal RNase protection pattern was observed,
the corresponding PCR products were cloned and sequenced. This led
to the identification of six point mutations (two described previously)
(Kinzler et al., supra), each of which was not found in the germline of
these patients (Table IIB). Four of the mutations resulted in amino acid
substitutions and two resulted in the alteration of splice site consensus
elements. Mutations at analogous splice site positions in other genes
have been shown to alter RNA processing in vivo and in vitro.

Three exons of APC were also evaluated in sporadic tumors.
Sixty tumors were screened by RNase protection, and an additional 98
tumors were evaluated by sequencing. The exons examined included nt
822-930, 931-1309, and 1406-1545 (Table I). A total of three mutations
were identified, each of which proved to be somatic. Tumor T27
contained a somatic mutation of CGA (arginine) to TGA (stop codon) at
codon 33. Tumor T135 contained a GT to GC change at a splice donor
site. Tumor T34 contained a 5 bp insertion (CAGCC between codons
288 and 289) resulting in a stop at codon 291 due to a frameshift.

We serendipitously discovered one additional somatic mutation in
a colorectal cancer. During our attempt to define the sequences and
splice patterns of the MCC and APC gene products in colorectal
epithelial cells, we cloned cDNA from the colorectal cancer cell line
SW480. The amino acid sequence of the MCC gene from SW480 was
identical to that previously found in clones from human brain. The
sequence of APC in SW480 cells, however, differed significantly, in
that a transition at codon 1338 resulted in a change from glutamine
(CAG) to a stop codon (TAG). To determine if this mutation was
somatic, we recovered DNA from archival paraffin blocks of the
original surgical specimen (T201) from which the tumor cell line was
derived 28 years ago.

DNA was purified from paraffin sections as described in S.E.
Goelz, S.R. Hamilton, and B. Vogelstein, Biochem. Biophys. Res.
Comm. 130, 118 (1985). PCR was performed as described in reference
24, using the primers 5'-GTTCCAGCAGTGTCACAG-3' and
5'-GGOAGATTTCGCTCCTGA-3'. A PCR product containing codon
1338 was amplified from the archival DNA and used to show that the
stop codon represented a somatic mutation present in the original
primary tumor and in cell lines derived from the primary and
metastatic tumor sites, but not from normal tissue of the patient.

The ten point mutations in the MCC and APC genes so far
discovered in sporadic CRCs are summarized in Table IIB. Analysis of
the number of mutant and wild-type PCR clones obtained from each of
these tumors showed that in eight of the ten cases, the wild-type
sequence was present in approximately equal proportions to the
mutant. This was confirmed by RFLP analysis using flanking markers
from chromosome 5q which demonstrated that only two of the ten
tumors (T135 and T201) exhibited an allelic deletion on chromosome 5q.
These results are consistent with previous observations showing that
20-40% of sporadic colorectal tumors had allelic deletions of
chromosome 5q. Moreover, these data suggest that mutations of 5q21
genes are not limited to those colorectal tumors which contain allelic
deletions of this chromosome.
Example 4 

This example characterizes small, nested deletions in DNA from
two unrelated FAP patients.

DNA from 40 FAP patients was screened with cosmids that had
been mapped into a region near the APC locus to identify small
deletions or rearrangements. Two of these cosmids, L5.71 and L5.79,
hybridized with a 1200 kb NotI fragment in DNAs from most of the FAP
patients screened.

The DNA of one FAP patient, 3214, showed only a 940 kb NotI
fragment instead of the expected 1200 kb fragment. DNA was analyzed
from four other members of the patient's immediate family; the
940 kb fragment was present in her affected mother (4711), but not in
the other, unaffected family members. The mother also carried a
normal 1200 kb NotI fragment that was transmitted to her two
unaffected offspring. These observations indicated that the mutant
polyposis allele is on the same chromosome as the 940 kb NotI
fragment. A simple interpretation is that APC patients 3214 and 4711
each carry a 260 kb deletion within the APC locus.

If a deletion were present, then other enzymes might also be
expected to produce fragments with altered mobilities. Hybridization
of L5.79 to NruI-digested DNAs from both affected members of the
family revealed a novel NruI fragment of 1300 kb, in addition to the
normal 1200 kb NruI fragment. Furthermore, MluI fragments in
patients 3214 and 4711 also showed an increase in size consistent with
the deletion of an MluI site. The two chromosome 5 homologs of
patient 3214 were segregated in somatic cell hybrid lines; HHW1155
(deletion hybrid) carried the abnormal homolog and HHW1159 (normal
hybrid) carried the normal homolog.

Because patient 3214 showed only a 940 kb NotI fragment, she
had not inherited the 1200 kb fragment present in the unaffected
father's DNA. This observation suggests that he must be heterozygous
for, and have transmitted, either a deletion of the L5.79 probe region
or a variant NotI fragment too large to resolve on the gel system. As
expected, the hybrid cell line HHW1159, which carries the paternal
homolog, revealed no resolved NotI fragment when probed with L5.79.
However, probing of HHW1159 DNA with L5.79 following digestion with
other enzymes did reveal restriction fragments, demonstrating the
presence of DNA homologous to the probe. The father is, therefore,
interpreted as heterozygous for a polymorphism at the NotI site, with
one chromosome 5 having a 1200 kb NotI fragment and the other
having a fragment too large to resolve consistently on the gel. The
latter was transmitted to patient 3214.

When double digests were used to order restriction sites within
the 1200 kb NotI fragment, L5.71 and L5.79 were both found to lie on a
550 kb NotI-NruI fragment and, therefore, on the same side of an NruI
site in the 1200 kb NotI fragment. To obtain genomic representation of
sequences present over the entire 1200 kb NotI fragment, we
constructed a library of small-fragment inserts enriched for sequences
from this fragment. DNA from the somatic cell hybrid HHW141, which
contains about 40% of chromosome 5, was digested with NotI and
electrophoresed under pulsed-field gel (PFG) conditions; EcoRI
fragments from the 1200 kb region of this gel were cloned into a phage
vector. Probe Map30 was isolated from this library. In normal
individuals probe Map30 hybridizes to the 1200 kb NotI fragment and
to a 200 kb NruI fragment. This latter hybridization places Map30
distal, with respect to the locations of L5.71 and L5.79, to the NruI
site of the 550 kb NotI-NruI fragment.

Because Map30 hybridized to the abnormal, 1300 kb NruI
fragment of patient 3214, the locus defined by Map30 lies outside the
hypothesized deletion. Furthermore, in normal chromosomes Map30
identified a 200 kb NruI fragment and L5.79 identified a 1200 kb NruI
fragment; the hypothesized deletion must, therefore, be removing an
NruI site, or sites, lying between Map30 and L5.79, and these two
probes must flank the hypothesized deletion. A restriction map of the
genomic region, showing placement of these probes, is shown in
Figure 5.

A NotI digest of DNA from another FAP patient, 3824, was
probed with L5.79. In addition to the 1200 kb normal NotI fragment, a
fragment of approximately 1100 kb was observed, consistent with the
presence of a 100 kb deletion in one chromosome 5. In this case,
however, digestion with NruI and MluI did not reveal abnormal bands,
indicating that if a deletion were present, its boundaries must lie
distal to the NruI and MluI sites of the fragments identified by L5.79.
Consistent with this expectation, hybridization of Map30 to DNA from
patient 3824 identified a 760 kb MluI fragment in addition to the
expected 860 kb fragment, supporting the interpretation of a 100 kb
deletion in this patient. The two chromosome 5 homologs of patient
3824 were segregated in somatic cell hybrid lines; HHW1291 was found
to carry only the abnormal homolog and HHW1290 only the normal
homolog.
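
The deletion sizes quoted for the two patients follow from the
difference between the normal and variant fragment sizes. The sketch
below restates that arithmetic with the fragment sizes given in the
text and checks that independent digests of the same patient agree. It
applies only when both flanking restriction sites are retained; where a
site is lost (as with the 1300 kb NruI fragment of patient 3214) the
variant fragment is larger and the subtraction does not apply.

```python
from typing import Dict, List, Set, Tuple

# (enzyme, normal fragment in kb, variant fragment in kb), as reported above.
OBSERVATIONS: Dict[str, List[Tuple[str, int, int]]] = {
    "3214": [("NotI", 1200, 940)],
    "3824": [("NotI", 1200, 1100), ("MluI", 860, 760)],
}


def deletion_size_kb(normal_kb: int, variant_kb: int) -> int:
    """Deletion size implied by a shortened fragment with both end sites intact."""
    if variant_kb >= normal_kb:
        raise ValueError("variant fragment is not shorter; a site may have been lost")
    return normal_kb - variant_kb


def deletion_estimates(patient: str) -> Set[int]:
    """Collect the deletion size implied by each digest for one patient."""
    return {deletion_size_kb(n, v) for _, n, v in OBSERVATIONS[patient]}


if __name__ == "__main__":
    for patient in OBSERVATIONS:
        print(patient, sorted(deletion_estimates(patient)))
```

A single value in the returned set means that all digests of that
patient point to the same deletion size.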

That the 860 kb MluI fragment identified by Map30 is distinct
from the 830 kb MluI fragment identified previously by L5.79 was
demonstrated by hybridization of Map30 and L5.79 to a NotI-MluI double
digest of DNA from the hybrid cell (HHW1159) containing the
nondeleted chromosome 5 homolog of patient 3214. As previously
indicated, this hybrid is interpreted as missing one of the NotI sites
that define the 1200 kb fragment. A 620 kb NotI-MluI fragment was seen
with probe L5.79, and an 860 kb fragment was seen with Map30.
Therefore, the 830 kb MluI fragment recognized by probe L5.79 must
contain a NotI site in HHW1159 DNA; because the 860 kb MluI fragment
remains intact, it does not carry this NotI site and must be distinct
from the 830 kb MluI fragment.
Example 5 

This example demonstrates the isolation of human sequences
which span the region deleted in the two unrelated FAP patients
characterized in Example 4.

A strong prediction of the hypothesis that patients 3214 and
3824 carry deletions is that some sequences present on normal
chromosome 5 homologs would be missing from the hypothesized deletion
homologs. Therefore, to develop genomic probes that might confirm
the deletions, as well as to identify genes from the region, YAC clones
from a contig seeded by cosmid L5.79 were localized from a library
containing seven haploid human genome equivalents (Albertsen et al.,
Proc. Natl. Acad. Sci. U.S.A., Vol. 87, pp. 4256-4260 (1990)) with
respect to the hypothesized deletions. Three clones, YACs 57B8,
310D8, and 183H12, were found to overlap the deleted region.

Importantly, one end of YAC 57B8 (clone ATS 7) was found to lie
within the patient 3214 deletion. Inverse polymerase chain reaction
(PCR) defined the end sequences of the insert of YAC 57B8. PCR
primers based on one of these end sequences repeatedly failed to
amplify DNA from the somatic cell hybrid (HHW1155) carrying the
deleted homolog of patient 3214, but did amplify a product of the
expected size from the somatic cell hybrid (HHW1159) carrying the
normal chromosome 5 homolog. This result supported the
interpretation that the abnormal restriction fragments found in the
DNA of patient 3214 result from a deletion.

Additional support for the hypothesis of deletion in DNA from
patient 3214 came from subcloned fragments of YAC 183H12, which
spans the region in question. Y11, an EcoRI fragment cloned from
YAC 183H12, hybridized to the normal, 1200 kb NotI fragment of
patient 4711, but failed to hybridize to the abnormal, 940 kb NotI
fragment of 4711 or to DNA from deletion cell line HHW1155. This
result confirmed the deletion in patient 3214.

Two additional EcoRI fragments from YAC 183H12, Y10 and
Y14, were localized within the patient 3214 deletion by their failure to
hybridize to DNA from HHW1155. Probe Y10 hybridizes to a 150 kb
NruI fragment in normal chromosome 5 homologs. Because the 3214
deletion creates the 1300 kb NruI fragment seen with the probes L5.79
and Map30 that flank the deletion, these NruI sites and the 150 kb NruI
fragment lying between must be deleted in patient 3214. Furthermore,
probe Y10 hybridizes to the same 620 kb NotI-MluI fragment seen with
probe L5.79 in normal DNA, indicating its location as L5.79-proximal to
the deleted MluI site and placing it between the MluI site and the
L5.79-proximal NruI site. The MluI site must, therefore, lie between
the NruI sites that define the 150 kb NruI fragment (see Figure 5).

Probe Y11 also hybridized to the 150 kb NruI fragment in the
normal chromosome 5 homolog, but failed to hybridize to the 620 kb
NotI-MluI fragment, placing it L5.79-distal to the MluI site, but
proximal to the second NruI site. Hybridization to the same (860 kb)
MluI fragment as Map30 confirmed the localization of probe Y11
L5.79-distal to the MluI site.

Probe Y14 was shown to be L5.79-distal to both deleted NruI
sites by virtue of its hybridization to the same 200 kb NruI fragment of
the normal chromosome 5 seen with Map30. Therefore, the order of
these EcoRI fragments derived from YAC 183H12 and deleted in
patient 3214, with respect to L5.79 and Map30, is
L5.79-Y10-Y11-Y14-Map30.

The 100 kb deletion of patient 3824 was confirmed by the failure
of aberrant restriction fragments in this DNA to hybridize with probe
Y11, combined with positive hybridizations to probes Y10 and/or Y14.
Y10 and Y14 each hybridized to the 1100 kb NotI fragment of patient
3824 as well as to the normal 1200 kb NotI fragment, but Y11
hybridized to the 1200 kb fragment only. In the MluI digest, probe Y14
hybridized to the 860 kb and 760 kb fragments of patient 3824 DNA, but
probe Y11 hybridized only to the 860 kb fragment. We conclude that
the basis for the alteration in fragment size in DNA from patient 3824
is, indeed, a deletion. Furthermore, because probes Y10 and Y14 are
missing from the deleted 3214 chromosome, but present on the deleted
3824 chromosome, and they have been shown to flank probe Y11, the
deletion in patient 3824 must be nested within the patient 3214
deletion.
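
The nesting argument is a statement about sets of probes. The sketch
below records which probes fail to hybridize to each deleted
chromosome, as reported in the text, and checks that the 3824 set is a
proper subset of the 3214 set and that each set is contiguous in the
probe order L5.79-Y10-Y11-Y14-Map30. The function names are
illustrative only.

```python
from typing import Dict, List, Set

PROBE_ORDER: List[str] = ["L5.79", "Y10", "Y11", "Y14", "Map30"]

# Probes that fail to hybridize to the deleted homolog of each patient.
MISSING: Dict[str, Set[str]] = {
    "3214": {"Y10", "Y11", "Y14"},
    "3824": {"Y11"},
}


def is_nested(inner: Set[str], outer: Set[str]) -> bool:
    """True if every probe lost in `inner` is also lost in `outer`, and more."""
    return inner < outer  # proper subset


def is_contiguous(missing: Set[str], order: List[str] = PROBE_ORDER) -> bool:
    """True if the missing probes form one uninterrupted run in the map order."""
    idx = sorted(order.index(p) for p in missing)
    if not idx:
        return True
    return idx == list(range(idx[0], idx[-1] + 1))


if __name__ == "__main__":
    print(is_nested(MISSING["3824"], MISSING["3214"]))
    print(all(is_contiguous(m) for m in MISSING.values()))
```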

Probes Y10, Y11, Y14 and Map30 each hybridized to YAC 310D8,
indicating that this YAC spanned the patient 3824 deletion and, at a
minimum, most of the 3214 deletion. The YAC characterizations,
therefore, confirmed the presence of deletions in the patients and
provided physical representation of the deleted region.
Example 6

This example demonstrates that the MCC coding sequence maps
outside of the region deleted in the two FAP patients characterized in
Example 4.

An intriguing FAP candidate gene, MCC, recently was ascertained
with cosmid L5.71 and was shown to have undergone mutation in
colon carcinomas (Kinzler et al., supra). It was therefore of interest to
map this gene with respect to the deletions in APC patients.
Hybridization of MCC probes with an overlapping series of YAC clones
extending in either direction from L5.71 showed that the 3' end of MCC
must be oriented toward the region of the two APC deletions.

Therefore, two 3' cDNA clones from MCC were mapped with
respect to the deletions: clone ICI (bp 2378-4181) and clone 7 (bp
2890-3560). Clone ICI contains sequences from the C-terminal end of
the open reading frame, which stops at nucleotide 2708, as well as 3'
untranslated sequence. Clone 7 contains sequence that is entirely 3' to
the open reading frame. Importantly, the entire 3' untranslated
sequence contained in the cDNA clones consists of a single 2.5 kb exon.
These two clones were hybridized to DNAs from the YACs spanning the
FAP region. Clone 7 fails to hybridize to YAC 310D8, although it does
hybridize to YACs 183H12 and 57B8; the same result was obtained with
the cDNA ICI. Furthermore, these probes did show hybridization to
DNAs from both hybrid cell lines (HHW1159 and HHW1155) and the
lymphoblastoid cell line from patient 3214, confirming their locations
outside the deleted region. Additional mapping experiments suggested
that the 3' end of the MCC cDNA clone contig is likely to be located
more than 45 kb from the deletion of patient 3214 and, therefore, more
than 100 kb from the deletion of patient 3824.
Example 7 

This example identifies three genes within the deleted region of 
chromosome 5 in the two unrelated FAP patients characterized in 
Example 4. 

Genomic clones were used to screen cDNA libraries in three
separate experiments. One screening was done with a phage clone
derived from YAC 310D8 known to span the 260 kb deletion of patient
3214. A large-insert phage library was constructed from this YAC;
screening with Y11 identified X205, which mapped within both
deletions. When clone X205 was used to probe a random-, plus
oligo(dT)-, primed fetal brain cDNA library (approximately 300,000
phage), six cDNA clones were isolated and each of them mapped
entirely within both deletions. Sequence analysis of these six clones
formed a single cDNA contig, but did not reveal an extended open
reading frame. One of the six cDNAs was used to isolate more cDNA
clones, some of which crossed the L5.71-proximal breakpoint of the
3824 deletion, as indicated by hybridization to both chromosomes of
this patient. These clones also contained an open reading frame,
indicating a transcriptional orientation proximal to distal with respect
to L5.71. This gene was named DP1 (deleted in polyposis 1). This gene
is identical to TB2 described above.

cDNA walks yielded a cDNA contig of 3.0-3.5 kb, and included
two clones containing terminal poly(A) sequences. This size
corresponds to the 3.5 kb band seen by Northern analysis. Sequencing
of the first 3163 bp of the cDNA contig revealed an open reading frame
extending from the first base to nucleotide 631, followed by a 2.5 kb 3'
untranslated region. The sequence surrounding the methionine codon
at base 77 conforms to the Kozak consensus of an initiation methionine
(Kozak, 1984). Failed attempts to walk farther, coupled with the
similarity of the lengths of isolated cDNA and mRNA, suggested that
the NH2-terminus of the DP1 protein had been reached. Hybridization
to a combination of genomic and YAC DNAs cut with various enzymes
indicated the genomic coverage of DP1 to be approximately 30 kb.

Two additional probes for the locus, YS-11 and YS-39, which had
been ascertained by screening of a cDNA library with an independent
YAC probe identified with MCC sequences adjacent to L5.71, were
mapped into the deletion region. YS-39 was shown to be a cDNA
identical in sequence to DP1. Partial characterization of YS-11 had
shown that 200 bp of DNA sequence at one end was identical to
sequence coding for the 19 kd protein of the ribosomal signal
recognition particle, SRP19 (Lingelbach et al., supra). Hybridization
experiments mapped YS-11 within both deletions. The sequence of this
clone, however, was found to be complex. Although 454 bp of the
1032 bp sequence of YS-11 were identical to the GenBank entry for the
SRP19 gene, another 578 bp appended 5' to the SRP19 sequence was
found to consist of previously unreported sequence containing no
extended open reading frames. This suggested that YS-11 was either a
chimeric clone containing two independent inserts or a clone of an
incompletely processed or aberrant message. If YS-11 were a
conventional chimeric clone, the independent segments would not be
expected to map to the same physical region. The segments resulting
from anomalous processing of a continuous transcript, however, would
map to a single chromosomal region.

Inverse PCR with primers specific to the two ends of YS-11, the
SRP19 end and the unidentified region, verified that both sequences
map within the YAC 310D8; therefore, YS-11 is most likely a clone of
an immature or anomalous mRNA species. Subsequently, both ends
were shown to lie within the deleted region of patient 3824, and YS-11
was used to screen for additional cDNA clones.

Of the 14 cDNA clones selected from the fetal brain library, one
clone, V5, was of particular interest in that it contained an open
reading frame throughout, although it included only a short identity to
the first 78 5' bases of the YS-11 sequence. Following the 78 bp of
identical sequence, the two cDNA sequences diverged at an AG.
Furthermore, divergence from genomic sequence was also seen after
these 78 bp, suggesting the presence of a splice junction, and
supporting the view that YS-11 represents an irregular message.

Starting with V5, successive 5' and 3' walks were performed; the
resulting cDNA contig consisted of more than 100 clones, which
defined a new transcript, DP2. Clones walking in the 5' direction
crossed the 3824 deletion breakpoint farthest from L5.71; since its 3'
end is closer to this cosmid than its 5' end, the transcriptional
orientation of DP2 is opposite to that of MCC and DP1.

The third screening approach relied on hybridization with a 120
kb MluI fragment from YAC 57B8. This fragment hybridizes with probe
Y11 and completely spans the 100 kb deletion in patient 3824. The
fragment was purified on two preparative PFGs, labeled, and used to
screen a fetal brain cDNA library. A number of cDNA clones
previously identified in the development of the DP1 and DP2 contigs
were reascertained. However, 19 new cDNA clones mapped into the
patient 3824 deletion. Analysis indicated that these 19 formed a new
contig, DP3, containing a large open reading frame.

A clone from the 5' end of this new cDNA contig hybridized to
the same EcoRI fragment as the 3' end of DP2. Subsequently, the DP2
and DP3 contigs were connected by a single 5' walking step from DP3,
to form the single contig DP2.5. The complete nucleotide sequence of
DP2.5 is shown in Figure 9.

The consensus cDNA sequence of DP2.5 suggests that the entire
coding sequence of DP2.5 has been obtained and is 8532 bp long. The
most 5' ATG codon occurs two codons from an in-frame stop and
conforms to the Kozak initiation consensus (Kozak, Nucl. Acids Res.,
Vol. 12, pp. 857-872 (1984)). The 3' open reading frame breaks down
over the final 1.8 kb, giving multiple stops in all frames. A poly(A)
sequence was found in one clone approximately 1 kb into the 3'
untranslated region, associated with a polyadenylation signal 33 bp
upstream (position 9530). The open reading frame is almost identical
to that identified as APC above.

An alternatively spliced exon at nucleotide 934 of the DP2.5
transcript is of potential interest. It was first discovered by noting
that two classes of cDNA had been isolated. The more abundant cDNA
class contains a 303 bp exon not included in the other. The presence in
vivo of the two transcripts was verified by an exon connection
experiment. Primers flanking the alternatively spliced exon were used
to amplify, by PCR, cDNA prepared from various adult tissues. Two PCR
products that differed in size by approximately 300 bases were
amplified from all the tissues tested; the larger product was always
more abundant than the smaller.
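
The coordinates of the alternatively spliced exon can be cross-checked
with simple arithmetic. The sketch below assumes that nucleotide 934 is
the first base of the 303 bp exon. It derives the last base of the exon
and the number of codons removed when the exon is skipped, and confirms
that skipping it leaves the reading frame intact.

```python
EXON_START = 934   # first nucleotide of the exon (assumed from the text)
EXON_LENGTH = 303  # length in bp, as stated in the text


def exon_end(start: int, length: int) -> int:
    """Last nucleotide of an exon given its first nucleotide and length."""
    return start + length - 1


def codons_removed(length: int) -> int:
    """Codons lost when the exon is skipped; rejects frame-shifting lengths."""
    if length % 3 != 0:
        raise ValueError("length is not a multiple of 3; skipping shifts the frame")
    return length // 3


if __name__ == "__main__":
    print(exon_end(EXON_START, EXON_LENGTH))
    print(codons_removed(EXON_LENGTH))
```

A 303 base difference also matches the observation that the two PCR
products differ in size by approximately 300 bases.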
Example 8 

This example demonstrates the primers used to identify subtle
mutations in DP1, SRP19, and DP2.5.

To obtain DNA sequence adjacent to the exons of the genes DP1,
DP2.5, and SRP19, sequencing substrate was obtained by inverse PCR
amplification of DNAs from two YACs, 310D8 and 183H12, that span
the deletions. Ligation at low concentration cyclized the restriction
enzyme-digested YAC DNAs. Oligonucleotides with sequencing tails,
designed in inverse orientation at intervals along the cDNAs, primed
PCR amplification from the cyclized templates. Comparison of these
DNA sequences with the cDNA sequences placed exon boundaries at
the divergence points. SRP19 and DP1 were each shown to have five
exons. DP2.5 consisted of 15 exons. The sequences of the
oligonucleotides synthesized to provide PCR amplification primers for
the exons of each of these genes are listed in Table III. With the
exception of exons 1, 3, 4, 9, and 15 of DP2.5 (see below), the primer
sequences were located in intron sequences flanking the exons. The 5'
primer of exon 1 is complementary to the cDNA sequence, but extends
just into the 5' Kozak consensus sequence for the initiator methionine,
allowing a survey of the translated sequences. The 5' primer of exon 3
is actually in the 5' coding sequences of this exon, as three separate
intronic primers simply would not amplify. The 5' primer of exon 4 just
overlaps the 5' end of this exon, and we thus fail to survey the 19 most
5' bases of this exon. For exon 9, two overlapping primer sets were
used, such that each had one end within the exon. For exon 15, the
large 3' exon of DP2.5, overlapping primer pairs were placed along the
length of the exon; each pair amplified a product of 250-400 bases.
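
Tiling a long exon with overlapping amplicons can be sketched as
follows. This is an illustration of the general idea only. The amplicon
length, the overlap, and the 1000-base exon in the example are
hypothetical values chosen for the sketch, and real primer placement
would also depend on primer sequence constraints that are not modelled
here.

```python
from typing import List, Tuple


def tile_amplicons(exon_length: int, amplicon_length: int = 400,
                   overlap: int = 50) -> List[Tuple[int, int]]:
    """Return half-open (start, end) intervals that cover the exon with overlap."""
    if exon_length <= 0:
        raise ValueError("exon_length must be positive")
    if not 0 <= overlap < amplicon_length:
        raise ValueError("overlap must be non-negative and smaller than the amplicon")
    step = amplicon_length - overlap
    amplicons = []
    start = 0
    while True:
        end = min(start + amplicon_length, exon_length)
        amplicons.append((start, end))
        if end == exon_length:
            break
        start += step
    return amplicons


if __name__ == "__main__":
    # Hypothetical 1000-base exon, 400-base products, 50-base overlap.
    for start, end in tile_amplicons(1000):
        print(start, end, end - start)
```

Keeping each product at or below about 400 bases matches the fragment
size range that SSCP analysis is described as handling.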
Example 9 

This example demonstrates the use of single stranded conformation
polymorphism (SSCP) analysis as described by Orita et al., Proc.
Natl. Acad. Sci. U.S.A., Vol. 86, pp. 2766-70 (1989) and Genomics,
Vol. 5, pp. 874-879 (1989) as applied to DP1, SRP19 and DP2.5.

SSCP analysis identifies most single- or multiple-base changes in
DNA fragments up to 400 bases in length. Sequence alterations are
detected as shifts in electrophoretic mobility of single-stranded DNA
on nondenaturing acrylamide gels; the two complementary strands of a
DNA segment usually resolve as two SSCP conformers of distinct
mobilities. However, if the sample is from an individual heterozygous
for a base-pair variant within the amplified segment, often three or
more bands are seen. In some cases, even the sample from a
homozygous individual will show multiple bands. Base-pair-change
variants are identified by differences in pattern among the DNAs of
the sample set.

Exons of the candidate genes were amplified by PCR from the
DNAs of 61 unrelated FAP patients and a control set of 12 normal
individuals. The five exons from DP1 revealed no unique conformers in
the FAP patients, although common conformers were observed with exons
2 and 3 in some individuals of both affected and control sets, indicating
the presence of DNA sequence polymorphisms. Likewise, none of the
five exons of SRP19 revealed unique conformers in DNA from FAP
patients in the test panel.

Testing of exons 1 through 14 and primer sets A through K of
exon 15 of the DP2.5 gene, however, revealed variant conformers
specific to FAP patients in exons 7, 8, 10, 11, and 15. These variants
were in the unrelated patients 3746, 3460, 3827, 3712, and 3751,
respectively. The PCR-SSCP procedure was repeated for each of these
exons in the five affected individuals and in an expanded set of 48
normal controls. The variant bands were reproducible in the FAP
patients but were not observed in any of the control DNA samples.
Additional variant conformers in exons 11 and 15 of the DP2.5 gene
were seen; however, each of these was found in both the affected and
control DNA sets. The five sets of conformers unique to the FAP
patients were sequenced to determine the nucleotide changes
responsible for their altered mobilities. The normal conformers from
the host individuals were sequenced also. Bands were cut from the
dried acrylamide gels, and the DNA was eluted. PCR amplification of
these DNAs provided template for sequencing.
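
The screening logic, which separates conformers seen only in patients
from conformers shared with controls, can be sketched with sets. The
sample names and band labels below are hypothetical placeholders. In
practice each label would stand for a band of a given mobility on the
SSCP gel.

```python
from typing import Dict, Set, Tuple


def classify_conformers(
    patients: Dict[str, Set[str]],
    controls: Dict[str, Set[str]],
) -> Tuple[Dict[str, Set[str]], Set[str]]:
    """Split patient bands into patient-only bands and bands shared with controls."""
    control_bands: Set[str] = set().union(*controls.values())
    unique: Dict[str, Set[str]] = {}
    shared: Set[str] = set()
    for sample, bands in patients.items():
        only_in_patient = bands - control_bands
        if only_in_patient:
            unique[sample] = only_in_patient
        shared |= bands & control_bands
    return unique, shared


# Hypothetical band patterns for one exon.
example_patients = {"P1": {"a", "b", "v1"}, "P2": {"a", "b", "c"}}
example_controls = {"C1": {"a", "b"}, "C2": {"a", "b", "c"}}

if __name__ == "__main__":
    unique, shared = classify_conformers(example_patients, example_controls)
    print(unique)
    print(sorted(shared))
```

Band "c" in this example plays the role of a common polymorphism: it is
variant relative to most samples but appears in a control, so it is not
reported as patient-specific.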

The sequences of the unique conformers from exons 7, 8, 10, and
11 of DP2.5 revealed dramatic mutations in the DP2.5 gene. The
sequence of the new mutation creating the exon 7 conformer in patient
3746 was shown to contain a deletion of two adjacent nucleotides, at
positions 730 and 731 in the cDNA sequence (Figure 7). The normal
sequence at this splice junction is CAGGGTCA (intronic sequence
underlined), with the intron-exon boundary between the two repetitions
of AG. The mutant allele in this patient has the sequence CAGCTCA.
Although this change is at the 5' splice site, comparison with known
consensus sequences of splice junctions would suggest that a functional
splice junction is maintained. If this new splice junction were
functional, the mutation would introduce a frameshift that creates a
stop codon 15 nucleotides downstream. If the new splice junction were
not functional, messenger processing would be significantly altered.

To confirm the 2-base deletion, the PCR product from FAP
patient 3746 and a control DNA were electrophoresed on an
acrylamide-urea denaturing gel, along with the products of a
sequencing reaction. The sample from patient 3746 showed two bands
differing in size by 2 nucleotides, with the larger band identical in
mobility to the control sample; this result was independent
confirmation that patient 3746 is heterozygous for a 2 bp deletion.

The unique conformer found in exon 8 of patient 3460 was found
to carry a C to T transition, at position 904 in the cDNA sequence of
DP2.5 (shown in Figure 7), which replaced the normal sequence of CGA
with TGA. This point mutation, when read in frame, results in a stop
codon replacing the normal arginine codon. This single-base change
had occurred within the context of a CG dimer, a potential hot spot for
mutation (Barker et al., 1984).

The conformer unique to FAP patient 3827 in exon 10 was found
to contain a deletion of one nucleotide (1367, 1368, or 1369) when
compared to the normal sequence found in the other bands on the SSCP
gel. This deletion, occurring within a set of three T's, changed the
sequence from CTTTCA to CTTCA; this 1 base frameshift creates a
downstream stop within 30 bases. The PCR product amplified from this
patient's DNA also was electrophoresed on an acrylamide-urea
denaturing gel, along with the PCR product from a control DNA and
products from a sequencing reaction. The patient's PCR product showed
two bands differing by 1 bp in length, with the larger identical in
mobility to the PCR product from the normal DNA; this result confirmed
the presence of a 1 bp deletion in patient 3827.
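
The reason the deleted nucleotide is given as one of three positions
can be shown directly: removing any one base from a run of identical
bases gives the same product. The sketch below tests every single-base
deletion of CTTTCA against the observed CTTCA. Mapping index 1 of the
string to cDNA position 1367 is an assumption made for the sketch.

```python
from typing import List

NORMAL = "CTTTCA"
MUTANT = "CTTCA"
FIRST_T_POSITION = 1367  # assumed cDNA position of the first T in the run


def delete_base(sequence: str, index: int) -> str:
    """Return the sequence with the base at the given 0-based index removed."""
    return sequence[:index] + sequence[index + 1:]


def candidate_deletions(normal: str, mutant: str) -> List[int]:
    """0-based indices whose single-base deletion turns `normal` into `mutant`."""
    return [i for i in range(len(normal)) if delete_base(normal, i) == mutant]


if __name__ == "__main__":
    indices = candidate_deletions(NORMAL, MUTANT)
    # Index 1 is the first T of the run, so subtract 1 to align it with 1367.
    print([FIRST_T_POSITION + i - 1 for i in indices])
```

Sequencing alone therefore cannot say which of the three T's was lost,
and the deletion is reported at an ambiguous position.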

Sequence analysis of the variant conformer of exon 11 from 
patient 3712 revealed the substitution of a T by a G at position 1500, 
changing the normal tyrosine codon to a stop codon. 
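
A short enumeration shows which tyrosine codon is compatible with a T
to G substitution that creates a stop codon. The sketch below tries
every T to G change in both tyrosine codons of the standard genetic
code and keeps those that give a stop. It is a consistency check on the
description, not a statement about the sequence in Figure 7.

```python
from typing import List, Tuple

TYROSINE_CODONS = ("TAT", "TAC")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def t_to_g_variants(codon: str) -> List[Tuple[int, str]]:
    """All codons reachable by changing one T to G, with the 1-based position."""
    return [
        (i + 1, codon[:i] + "G" + codon[i + 1:])
        for i, base in enumerate(codon)
        if base == "T"
    ]


def nonsense_routes() -> List[Tuple[str, int, str]]:
    """(tyrosine codon, changed position, stop codon) for every T to G route."""
    routes = []
    for codon in TYROSINE_CODONS:
        for position, variant in t_to_g_variants(codon):
            if variant in STOP_CODONS:
                routes.append((codon, position, variant))
    return routes


if __name__ == "__main__":
    print(nonsense_routes())
```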

The pair of conformers observed in exon 15 of the DP2.5 gene 
for FAP patient 3751 also was sequenced. These conformers were 
found to carry a nucleotide substitution of C to G at position 5253, the 
third base of a valine codon. No amino acid change resulted from this 
substitution, suggesting that this conformer reflects a genetically silent 
polymorphism.

The observation of distinct inactivating mutations in the DP2.5
gene in four unrelated patients strongly suggested that DP2.5 is the
gene involved in FAP. These mutations are summarized in Table IIA.

Example 10

This example demonstrates that the mutations identified in the
DP2.5 (APC) gene segregate with the FAP phenotype.

Patient 3746, described above as carrying an APC allele with a
frameshift mutation, is an affected offspring of two normal parents.
Colonoscopy revealed no polyps in either parent nor among the
patient's three siblings.

DNA samples from both parents, from the patient's wife, and
from their three children were examined. SSCP analysis of DNA from
both of the patient's parents displayed the normal pattern of
conformers for exon 7, as did DNA from the patient's wife and one of
his offspring. The two other children, however, displayed the same new
conformers as their affected father. Testing of the patient and his
parents with highly polymorphic VNTR (variable number of tandem
repeat) markers showed a 99.98% likelihood that they are his biological
parents.

These observations confirmed that this novel conformer, known 
to reflect a 2 bp deletion mutation in the DP2.5 gene, appeared sponta- 
neously with FAP in this pedigree and was transmitted to two of the 
children of the affected individual. 

Example 11

This example demonstrates polymorphisms in the APC gene
which appear to be unrelated to disease (FAP).

Sequencing of variant conformers found among controls as well
as individuals with APC has revealed the following polymorphisms in
the APC gene: first, in exon 11, at position 1458, a substitution of T to
C creating an RsaI restriction site but no amino acid change; and second,
in exon 15, at positions 5037 and 5271, substitutions of A to G and
G to T, respectively, neither resulting in amino acid substitutions.
These nucleotide polymorphisms in the APC gene sequence may be
useful for diagnostic purposes.

Example 12

This example shows the structure of the APC gene.

The structure of the APC gene is schematically shown in
Figure S, with flanking intron sequences indicated.

The continuity of the very large (6.5 kb), most 3' exon in DP2.5
was shown in two ways. First, inverse PCR with primers spanning the
entire length of this exon revealed no divergence of the cDNA
sequence from the genomic sequence. Second, PCR amplification with
converging primers placed at intervals along the exon generated products
of the same size whether amplified from the originally isolated
cDNA, cDNA from various tissues, or genomic template. Two forms of
exon 9 were found in DP2.5: one is the complete exon; and the other,
labeled exon 9A, is the result of a splice into the interior of the exon
that deletes bases 934 to 1236 in the mRNA and removes 101 amino
acids from the predicted protein (see Figure 11).
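
The arithmetic behind the exon 9A deletion can be verified directly: bases 934 to 1236 inclusive span 303 nucleotides, which is exactly 101 codons, so the splice removes 101 amino acids without shifting the reading frame. The short sketch below is illustrative; the function name `deletion_summary` is hypothetical.

```python
def deletion_summary(first_base, last_base):
    """Summarize an inclusive mRNA deletion.

    Returns (bases_removed, whole_codons_removed, in_frame).
    """
    bases_removed = last_base - first_base + 1
    codons, remainder = divmod(bases_removed, 3)
    return bases_removed, codons, remainder == 0


# Exon 9A: bases 934 to 1236 of the mRNA are spliced out.
print(deletion_summary(934, 1236))
```

A deletion whose length is a multiple of three leaves the downstream reading frame intact, which is consistent with exon 9A encoding a shorter but otherwise unaltered protein.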

Example 13

This example demonstrates the mapping of the FAP deletions
with respect to the APC exons.

Somatic cell hybrids carrying the segregated chromosomes 5
from the 100 kb (HHW1291) and 260 kb (HHW1155) deletion patients
were used to determine the distribution of the APC gene's exons across
the deletions. DNAs from these cell lines were used as template, along
with genomic DNA from a normal control, for PCR-based amplification
of the APC exons.

PCR analysis of the hybrids from the 260 kb deletion of patient
3214 showed that all but one (exon 1) of the APC exons are removed by
this deletion. PCR analysis of the somatic cell hybrid HHW1291, carrying
the chromosome 5 homolog with the 100 kb deletion from patient
3824, revealed that exons 1 through 9 are present but exons 10 through
15 are missing. This result placed the deletion breakpoint either
between exons 9 and 10 or within exon 10.
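
The inference from the exon-by-exon PCR results to a breakpoint position can be sketched as follows. This is an illustration only, assuming 15 exons and assuming that each deletion removes every exon 3' of its breakpoint, as observed for both hybrids; the function name `breakpoint_interval` is hypothetical.

```python
def breakpoint_interval(amplified, n_exons=15):
    """Locate a deletion breakpoint from the set of exons that amplify.

    Returns (last_retained, first_missing). The breakpoint lies between
    these two exons or within the first missing exon.
    """
    last_retained = 0
    for exon in range(1, n_exons + 1):
        if exon not in amplified:
            break
        last_retained = exon
    # The sketch only handles a single contiguous 5' block of retained exons.
    if any(exon > last_retained for exon in amplified):
        raise ValueError("retained exons are not one contiguous 5' block")
    first_missing = last_retained + 1 if last_retained < n_exons else None
    return last_retained, first_missing


# HHW1291 (100 kb deletion): exons 1 through 9 amplify.
print(breakpoint_interval(set(range(1, 10))))
# HHW1155 (260 kb deletion): only exon 1 amplifies.
print(breakpoint_interval({1}))
```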

Example 14

This example demonstrates the expression of alternately spliced
APC messenger in normal tissues and in cancer cell lines.

Tissues that express the APC gene were identified by PCR
amplification of cDNA made to mRNA with primers located within
adjacent APC exons. In addition, PCR primers that flank the alternatively
spliced exon 9 were chosen so that the expression pattern of
both splice forms could be assessed. All tissue types tested (brain, lung,
aorta, spleen, heart, kidney, liver, stomach, placenta, and colonic
mucosa) and cultured cell lines (lymphoblasts, HL60, and
choriocarcinoma) expressed both splice forms of the APC gene. We
note, however, that expression by lymphocytes normally residing in
some tissues, including colon, prevents unequivocal assessment of
expression. The large mRNA, containing the complete exon 9 rather
than only exon 9A, appears to be the more abundant message.

Northern analysis of poly(A)-selected RNA from lymphoblasts
revealed a single band of approximately 10 kb, consistent with the size
of the sequenced cDNA.

Example 15

This example discusses structural features of the APC protein
predicted from the sequence.

The cDNA consensus sequence of APC predicts that the longer,
more abundant form of the message codes for a 2842 or 2844 amino
acid peptide with a mass of 311.8 kd. This predicted APC peptide was
compared with the current data bases of protein and DNA sequences
using both Intelligenetics and GCG software packages. No genes with a
high degree of amino acid sequence similarity were found. Although
many short (approximately 20 amino acid) regions of sequence similarity
were uncovered, none was sufficiently strong to reveal which, if
any, might represent functional homology. Interestingly, multiple
similarities to myosins and keratins did appear. The APC gene also was
scanned for sequence motifs of known function; although multiple
glycosylation, phosphorylation, and myristoylation sites were seen,
their significance is uncertain.
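
As a rough plausibility check on the predicted size, the quoted mass can be divided by the quoted residue counts to obtain the implied mean mass per residue. The sketch below is illustrative only; the comparison value of roughly 110 daltons per residue is a general rule of thumb for proteins and is an assumption not taken from this specification.

```python
MASS_KD = 311.8          # predicted mass quoted in the text, in kilodaltons
LENGTHS = (2842, 2844)   # residue counts quoted in the text


def mean_residue_mass(mass_kd, n_residues):
    """Mean mass per residue in daltons for a peptide of the given size."""
    return mass_kd * 1000.0 / n_residues


for n in LENGTHS:
    print(n, round(mean_residue_mass(MASS_KD, n), 2))
```

Both values fall close to the assumed rule-of-thumb figure, so the quoted mass and length are mutually consistent.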

Analysis of the APC peptide sequence did identify features
important in considering potential protein structure. Hydropathy plots
(Kyte and Doolittle, J. Mol. Biol. Vol. 157, pp. 105-132 (1982)) indicate
that the APC protein is notably hydrophilic. No hydrophobic domains
suggesting a signal peptide or a membrane-spanning domain were
found. Analysis of the first 1000 residues indicates that α-helical rods
may form (Cohen and Parry, Trends Biochem. Sci. Vol. 11, pp. 245-248
(1986)); there is a scarcity of proline residues and there are a number of
regions containing heptad repeats (apolar-X-X-apolar-X-X-X).
Interestingly, in exon 9A, the deleted form of exon 9, two heptad repeat
regions are reconnected in the proper heptad repeat frame, deleting
the intervening peptide region. After the first 1000 residues, the high
proline content of the remainder of the peptide suggests a compact
rather than a rod-like structure.
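
The two kinds of analysis mentioned above, a sliding-window hydropathy profile and a search for heptad repeats, can be sketched as follows. This is a minimal illustration and not the software actually used. The hydropathy values are the Kyte and Doolittle index; the window length of 9, the set of residues treated as apolar, and the function names are assumptions. The 16-residue test peptide is the start of SEQ ID NO:2 as best read from the listing, and the second peptide is synthetic.

```python
# Kyte-Doolittle hydropathy index, one-letter amino acid codes.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Residues counted as apolar for the heptad test (an assumption).
APOLAR = set("AILMFVW")


def hydropathy_profile(seq, window=9):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KD[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def heptad_frame_scores(seq):
    """For each of the 7 frames, the fraction of first and fourth heptad
    positions (apolar-X-X-apolar-X-X-X) occupied by an apolar residue."""
    result = []
    for frame in range(7):
        positions = [
            i for i in range(frame, len(seq)) if (i - frame) % 7 in (0, 3)
        ]
        hits = sum(seq[i] in APOLAR for i in positions)
        result.append(hits / len(positions) if positions else 0.0)
    return result


n_terminus = "MAAASYDQLLKQVEAL"      # residues 1-16 as read from SEQ ID NO:2
print(hydropathy_profile(n_terminus))

synthetic = "LKKLEEK" * 3            # idealized heptad repeat, not APC
print(heptad_frame_scores(synthetic))
```

A strongly negative profile over long stretches corresponds to a hydrophilic protein, and a frame whose score approaches 1.0 over several consecutive heptads marks a candidate coiled-coil region.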

The most prominent feature of the second 1000 residues is a 20
amino acid repeat that is iterated seven times with semiregular spacing
(Table 4). The intervening sequences between the seven repeat regions
contained 114, 116, 151, 205, 107, and 58 amino acids, respectively.
Finally, residues 2200-2400 contain a 200 amino acid basic domain.
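
The figures given for the 20 amino acid repeat are internally consistent: seven repeats require six intervening segments, and the whole repeat region fits within a 1000 residue stretch. A short illustrative calculation follows; the variable names are hypothetical.

```python
REPEAT_LENGTH = 20                         # amino acids per repeat
SPACERS = [114, 116, 151, 205, 107, 58]    # intervening segments, in order

n_repeats = len(SPACERS) + 1               # six gaps separate seven repeats
region_length = n_repeats * REPEAT_LENGTH + sum(SPACERS)

print(n_repeats, sum(SPACERS), region_length)
```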
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Ciu xrg Tyr Ser Clu Cl« Clu cin Hi. Clu «u Clu Clu Ar, Pro 

ilSO 1155 



2638 



2606 



2934 



2962 



3030 



3078 



s {ji s; I ij; is; i " " " " 
5s rj Ji: i^* - ;s is jii 

ewi TCC TAT TCT CAA CAT CAT CAA ACT AAC TTT TCC ACT TAT CCT CAA 
§K sir Wt IVt etc A.p A.p 01« S«r Ly. Phe Cy« Ser Tyr Cly cln 
985 990 *'* 

*xe eex CCC CAC CTA CCC CAT AAA ATA CAT AM CCA AAT CAT ATC CAT 
?^ 5S 5S MP AU HI. ty. li. Hi. S.r Al. A.„ Hi. M.t A.p^ 

1000 i"*"' 

CAT AAT CAT CCA CAA CTA CAT ACA CCA ATA AAT TAT ACT CTT AAA TAT 3126 

£p Sly «iu^i^« j;-/- i^So*'' 

TCA CAT CAC CAC TTC AAC TCT CGA ACC CAA ACT CCT TCA CAC AAT CAA 3174 
ISr A«p Clu Cln Leu A.n Ser Cly Ar, Cln Ser Pro S.r "n A.n Clu 
1035 1040 104b 

ACA TGC OCA ACA CCC AAA CAC ATA ATA CAA CAT CAA ATA AAA CAA ACT 3222 
S AU pro Ly. Hi. ne II. Clu A.p Olu 11. Ly. Cln Scr 
iOSO 1*>5S 

OAC CAA ACA CAA TCA ACC AAT CAA ACT ACA ACT TAT CCT CTT TAT ACT 3270 
til ^ Arg Cln Ser Arg A.n cln Ser Thr Thr Tyr Pro V*l Tyr Thr 
1065 lO'O 

c»c «.cc ACT CAT CAT AAA CAC CTC AAC TTC CAA CCA CAT TTT CCA CAC 3316 
til Ser Z aS Ly. Hi. L.u Ly. Phe Cln Pro Hi. Phe Cly Cln^ 
1080 1085 iosw 

CAC CAA TCI CTT TCT CCA TAC ACC TCA CCC CCA CCC AAT CCT TCA CAA 3366 
t^ t^ S. vH ser Pro Tyr Arg Ser Arg Cly Al. A.n Cly S.r Clu 

UOO iJiw 

xex AAT CCA CTC CCT TCT AAT CAT CCA ATT AAT CAA AAT CTA ACC CAC 3414 
tS^ a" vll Cly ser A.n Hi. Cly He A.n Cln A.n V.l S.r Cln 
1115 1120 

TTC TCT CAA CAA CAT CAC TAT CAA CAT CAT AAC CCT ACC AAT TAT 3462 
S?r ZIu Cyl cm Clu A.p A.p Tyr Clu A.p A.p Ly. Pro Thr A.n Tyr 
1130 113- 

ACT CAA CCT TAC TCT GAA CAA CAA CAC CAT CAA CAA CAA CAC ACA CCA 
Clu J 
114S 
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5' ;ii ^ SI - ^ s: s; s: ^ "J 

Thr A«n Tyr s«r 21« ^y*" ii70 ^^^^ 

1160 

CCT ATT CXT TAT «^ 5" 5« ^ S 

^ CAC TC. TTT TCA TTC TCA ACT TCA TCT CCA CAA ACC ACT AAA 
ty* Oln ser Phe See Ph« S«r Ly" ijos 
119S " 

1210 

c« «« c« »t go 2= «T iS S JS IT. S5 

Asn Al* ty* Arg Gin A.n "n i«« ^^35 
122 S 

.CT CCT CAC CCT CAA AAC CCT IS 12 iH 

S.r Cly cm Pre cm J-y« ^* ^•'^^ jjso 1255 
1240 

CAA ACA ATA CAC ACT TAT gT CTA CAA CAT ACT CCA ATA TCT TTT 

XCA ACA TCT ACT TCA TTA TCA TCT TTC TCX TCA CCT CAA CAT CAA ATA 
S«r AC9 Cyt fer Ser Wu Ser Ser Leu s ^^^^ 

CCA TCT AAT CAC ACC ACA CAC CAA CCA CAT TCT CCT J^;; ^ 
Cly Cy« A«n Cln Thr Thr Cln "« Al» *tp ^^^^ 

1305 

1320 ^^^^ 

• ..^^ »f T -PTA TCT TCA CAA TCA CCC ACC CAC AAA CCT 
AGA CTG aC CCT TCT ACT TTA TCT TCA ^.^ 

Arq L«u Cln Cly Ser sc^ i^w 1350 

CTT CAA TTT CCT TCA CCA CCC AAA TCT CCC TCC AAA ACT CCT CCT CAC 
val Clu Phe Pre ser Cly *1* ty« l*' 136S 
13S5 

«; - ^JV. S i5,?JI - - - - iiS 

- ;s IS s ni is; s: - - 



3558 



3606 



36S4 



3702 



3750 



3798 



3846. 



3a94 



3942 



3990 



403& 



4066 



4134 



4182 



4230 



1385 1^^^ 
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IS s js sf sn sr. s :n 

1400 

.__ -rr CCA CAT Acc ccr sca caa acc atc 

ts s \ii ^ s: rr. ». 

^ 1420 



St; Si ^.p is; s: ss ifi 



1465 



CAC CTt CTT CCA CAT OCT CAJ «J| £1 5!^ JS^ C^J 

Cln V4l L€U ?ro Asp ^•/■P itjo 1495 

1480 

_ — .HP^. m^f^ TCA TCC ACC CTC ACT CCT CTC ACC CTC 

?s s: SI IS s i-- l.. ... 



1500 



r^T CAC CCA TTT ATA CAC AAA CAT CTC CAA TTA ACA ATA ATC CCT CCA 
til pro ^ Vy. A.p VjX^Clu ..u Ar, He M«^Pro Pro 



1515 



^^1- »mT cxe AAT CCC AAT CAA ACA CAA TCA GAG CAC CCT AAA 

5n Sn S5 Ain Thr 5.r Olw flln P.. ty. 

1530 "35 

cxx AAC CAA CAO AAA CAC CCA GAA AAA ACT ATT CAT TCT 
Cl« Ly. Clu Al. Clu Ly. Thr ZU A.p Set 
1545 1S50 "55 

i^^p 2i n: 21 12 si? s^J HI- °" 
s: s si s .^s s tji^s;; 

1580 is'w 

v»» xie rr» ccc CAC ACT CCT TCA AAA TTA CCT CCA CCT CTC CCA ACC 
AAA AAC CCA CCC CAC A« 

Ly. Ly. Pro *" jJoo i605 



r-r» AtST CAG CTC CCT CTC TAC AAA CTT CTA CCA TCA CAA AAC ACC 

tTo S« CU Zl ryr Ly. L.. Fro Ser Cln Aon Ar, 

1610 1^^^ 

11; Si sti ij; SI 1:1 s ;s ;i? s 



4216 



4326 



<374 



4422 



4470 



4518 



4S66 



4614 



4662 



47i: 



475ft 



480i 



4854 



4902 



49£0 



1625 
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ATA AAC TTT TCC AC A CCT ACA 
CCC «C TAT TCT CTT CAX C« *CA CCT ATA AA^ 

Arg v.l Tyr cyi v*l cl" ^"^^ uso 

16*0 IMS ^ 

... -cc CCT CCA AAT CAC TTA CCT CCT 
CTA ACT CAT CTA ACA ATC CAA TCC CCT 

Ser l«« *"P 1665 

sj IK i?j 5n - - io- - - ^ 

1690 

^ »« Tcx eg »~ x» g s» s ;ij S5 !s: SI SK 

Vyn Thr $«r Ser VaI Thr . 171s 

1705 

^ XTT AAT TCT CCT ATC CCC AAA CCC 

CXA «T OAT ATT CTT CCA OA* «C ATT AAT 

Clu CXy A«p Il« I'*" ^i?c 1730 
1720 -^^^^ 

175S ^'^'^ 

^ CTA AAA CCT ATA CCA CAA AAT ACT 

AAG AAA AAC AAA CCA ACT TCA CCA CTA AAA 

uyt ly* ty* I'y* * 1775 n«c 

1770 

!2« S « °" " ^ 

Clu Tyr Arg Thr Arg 1795 
1785 

— ^ rie AXC AAA CAT TCA AAS AAA CAO AAT 

m SI S2 - s IS IIS t?; «» 

ItOO 

..^ r»r TTC AAT CAT AAC CTC CCA AAT AAT CAA 

2: ^ Ki; iS: IS - };| - tISo"" 

^ ccx s s 5"p 5S s" s: "5 

A«p AT? V.X ATfl Cly ser Phe AH Fn. p ^^^g 

^ «-T Txe TCT TTT TCA CCA AAT CAT TCT TTC ACT 
CCT ATT CAA CCA ACT CCT TAC TO .TT 

Pro !!• Clu Cly Thr Fro ly* w jg^^ 

1850 ^^^^ 

TCT CT. CX, TTT .JT JX, CXT .XT CTT J.C - ^ SI 

Ser L«u A«p Phe A«? A«p ai? n p r ^^^^ 



4996 



S046 



S094 



5142 



5190 



5338 



5286 



5334 



5382 



5420 



547B 



5526 



5S74 



5622 



5670 
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CKA TTA ACA AAC CCA AAA CAA AAT AAC CAA TCA CAC CCT AAA CTT ACC 
Clu L.U Arg Lyt Alt ty» Clu A.n Ly« Clu S«r Clu AU ly» V4I Thr 
1880 1885 1890 1895 

ACC CAC ACA CAA CTA ACC TCC AAC CAA CAA TCA CCT AAT AAC ACA CAA 
S8r Ht« Thr Clu Leu Thr Ser Ain Cln Cln Ser Al« Aen Lye Thr Cln 
1900 ISOS 191C 

CCT ATT CCA AAC CAC CCA ATA AAT CCA CCT CAC CCT AAA CCC ATA CTT 
AlA Il« AlA Ly« Cln Pro lie A»n Arg Cly Cln Pro Ly» Pro lie Leu 
191S IWO 1925 

CAC AAA CAA TCC ACT TTT CCC CAC TCA TCC AAA CAC ATA CCA CAC ACA 
Cln Lyt cm ser Thr Phe Pro Cln Ser Ser Ly« Asp lie Pro Aep Arg 
^ 1930 1935 1940 

CCC CCA CCA ACT GAT CAA AAC TTA CAC AAT TTT CCT ATT CAA AAT ACT 
ClY Ale Al* Thr A«p Clu ty Leu Gin Aen Phe AIa lie Clu Aen Thr 
' 1945 1950 1955 



CCA CTT TCC TTT TCT CAT AAT TCC TCT CTC ACT TCT CTC ACT CAC ATT 
Pro Vel Cy» Phe Ser Hie A«n Ser Ser Leu Ser Ser Leu Ser Aep lie 

lllo . ^ 1965 1970 1975 

CAC CAA CAA AAC AAC AAT AAA CAA AAT CAA CCT ATC AAA CAC ACT CAC 
Asp Cln Clu A«n Aen A«n Ly» Clu A»n CluPro lie Lye Clu Thr Clu 



1980 



1965 



1990 



CCC CCT CAC TCA CAC CCA CAA CCA ACT AAA CCT CAA CCA TCA CCC TAT 
Pro Pro Aep Ser Cln Cly Clu Pro Ser Lye Pro Cln Al* Scr Cly Tyr 
^ 1995 2000 2005 

CCT CCT AAA TCA TTT CAT CTT CAA CAT ACC CCA CTT TCT TTC TCA ACA 
Ale Pro tye Ser Phe Hie Vel Clu Aep Thr Pro V4I Cye Phe Ser Arg 

2010 2015 2020 

AAC ACT TCT CTC ACT TCT CTT ACT ATT CAC TCT CAA CAT CAC CTC TTC 
Aen Ser Ser Leu Ser Scr Leu ser lit Aep Ser Clu Aep Aep Leu Leu 

2025 2030 2035 

CAC CAA TCT ATA ACC TCC CCA ATC CCA AAA AAC AAA AAC CCT TCA ACA 
Cln Clu cye lie Ser Ser Ala Met Pro Lye Lye Lye Lye Pro Scr Arg 
2040 2045 2050 2055 

CTC AAC CCT CAT AAT CAA AAA CAT ACT CCC ACA AAT ATC CCT CCC ATA 
Leu Lye Cly Aep Aen Clu Lye Hie Scr Pro Arg Aen Met Cly Cly lie 
2060 2065 2070 

TTA CCT CAA GAT CTC ACA CTT CAT TTC AAA CAT A^A CAC ACA CCA CAT 
l^u Cly Clu Aep Leu Thr Leu Aep Leu Lye Aep I e Cln Arg Pro Aep 
2075 2080 2085 

TCA CAA CAT CCT CTA TCC CCT CAT TCA CAA AAT TTT CAT TCC AAA CCT 
ser Clu Hie cly Leu Ser Pro Aep Scr Clu Aen Phe Aep Trp Lye Ala 
2090 2095 2100 



5718 



5766 



5814 



5862 



5910 



5958 



6006 



6054 



6102 



6150 



6198 



6246 



6294 



6342 



ATT CAC CAA CCT CCA AAT TCC ATA CTA ACT ACT TTA CAT CAA CCT CCT 
lie Cln Clu Cly Ala Aen Ser lie Val Ser Scr Leu Hie Cln Ala Ala 
2105 2110 2115 



6390 
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— ,> ^» -rrr A6A CAA CCT TCC TCT CAT TCA CAT TCC ATC 
CCI CCT CCA TCT TTA TCI WA ser 11. 

Al* Ala AI* Cy« L.O «« 2130 ^13 

2120 2azs - . -rr xcA 

2170 ^-^'^ 

^ „x ccx j« «j =» g; »i SI Si E?J w SS 

Clu ser ty« cly 2X95 

«Te TCT CCA CCC ACC ACA ATC ATT 

S S Si iiS^iS ?S S in IS 2. - - 

„T .TT CCX CCJ JJT .CC ™ 52 5S 511 S 

Hi! lie fro ser ser^ j^^j 

- .-w »»r »«• CCA CCC TCC AAA ACC CCT ACT CAA 
AAA AAA OOC CCA CCC CTT AAG ACT CCA CCC . 

tye ty« Cly Pro ?ro t«u tye "r w ^^^^j 

22S0 

2265 ^^^^ 

xcx TTA xcc ccr cxx ccc .a. cxa xcx tec at. cct c« tc. 
ser Clu L«u Ser Pro Ma ^^"^ ^'^^ ^^^q 
2250 2285 

«r/.k rcA TCT xcx CXT TCC XCC CCT TCA XCX 
XCT AXX CCA CCT TCT XGA TCX CCX TCT XCX C 

ser Ly« Ala Pro Scr^Xr? S.r Cly s.r xrg^x p ^^^^ 

, XTX CAC TCT CCT CCC CCX XXC 

CCT GCC CXG CXX CCX TTX ACT XCX CCT XTX CAC l 

Pro Xli Gin cm pro L«u Ser Xr? ^^'^ 2325 
231S ^-''^ 

- »»T CCA XTX XCT CCT CCT XXC XXX TTA TCT 
TCX ATT TCC CCT CCT ACA XAT CCA ATX ACT 
Scr lie ser Pro Gly Arg xan wxj^* ^^^^ 



2300 

C 
I 

231S 
C 

s 

2330 

C« CTT C=. «C ;=J |» TCC CCT JCT .=T CCT TCJ ACT JJC TCC TCJ 

cm Leu Pro Arg Thr S«r J"^® 2355 
234S 



643£ 

6534 
65S2 
6630 
6676 
6726 
6774 
6822 
6670 
6918 
6966 
7014 
7062 
7110 
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2360 2365 

^ XJLA CXX ACA CCT TTA TCC XXC XAT CCC >CT ACT ATT 

CAC AAC 5" ^^ 5^ cw S«r Ly« X.n Alt S«f Scr He 
Gin A«n Leu Thr Ly* Cln Thr ciy ^ r j^g^ 

cxc TCT CCC TCC AAA CCA CTA AAT CAC ATC AAT AAT CCT 
CCA AGA ACT CAC TCT CCC TCC 

pro Ar9 S«r CX« Ser AU ser i.y» y ^^^^ 
2395 ^'•ww 

2410 



-f«rf* XTC AAA CAA CCT CCA ACC CCA ACC TTA ACA ACA AAA 

5S Vr^l ill Si Si XI. pre S.r Pro^Tnr t.« at, xr, ty.^ 

2440 2*** 

nxc CAX TCT OCT TCX TTT CAA TCT CTT TCT CCA TCA TCT AGA CCA 
Su 111 lu I" SI Ir Phe Clu s.r Leu S.r Pre Ser Ser Arg^Pre 



3460 



— fff xcc TCC CAC CCA CAA ACT CCA CTT TTA ACT CCT TCC 
iVr \frP llr Sn AU Gl.^Thr Pre V»: te« Ser^Pro S.r 

^r-r CXT TCC TCT CTT CAC CCT CCT CCA 

S S S5 S S S S ;S SS IS v.: «; «. oi, 

,r_/- rr» XXX trrc CCA CCT AAT CTC ACT CCC ACT ATA CAC TAT AAT SAT 
?^ S A.n^L.u S.r Pro T^r Ue^cl- Tyr A.n A.p 

r^v xex CCA CCA AAG CCC CAT GAT ATT CCA CCC TCT CAT TCT CAA ACT 
P" AU Ar, Hi. A.F no AU AT, S.r Hi. Ser cl„ S.r 

2520 2S25 

- IS is; s 5S s SI 12 Hj5 - - iE"'- 



2540 



^^r kk* CAT TCA TCA TCC CTT CCT CCA CTA AOC ACT TCC ACA ACA ACT 
Sil ITr ITr Ser Uu Pro Ar|^V*l ser Thr Trp Ar9^xrg Tnr 
2 S S S 

.-^ ™ -CA XTT CTT TCT CCT TCA TCA CAA TCC ACT CAA AAA 
'^11 IT. Ill S« iH "u S.r AU ser Ser Clu S.r^S.r Cl- Ly. 
2570 ^^^^ 

r>&r eXT CAA AAA CAT CTC AAC TCT ATT TCA CCA ACC AAA 
III ^ s" III 1% C^^l t^^Hi. V.1 M„ ser U.^S.r CXy Thr .y. 



7158 



7206 



7254 



7302 



7350 



7398 



7446 



7494 



7542 



7590 



7638 



76S6 



7734 



7782 



7830 



2585 
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- »»» eXA CTA TCC CCA JUa CGh ACX TCC XCA AAA ATA 

^ S: ^ ^ 5It *« XX. .y. CXy^Th. trp .r, ty. U.^ 

2600 2^**^ 

«» °" r I?* ?s SI t:i JS s; 

X,y« CXu Ain Glu Phe 5er rrv * ^^^^ 2^30 

- s s is^S'ss 5fl - - *" 

2650 

CCC ATT AAC AAT CCT AOJ T« C«A ACX TCT CCC ACX CCT AAT ACT 
cy. pro II. x.n A.n Pro *r9 S»r "y xrg » ^^^^ 
2665 

s s vi! s si5 ;s IS IK tjs ii: s Jts ?n 

2680 2665 

. rr» XXX CAA AAT CTC OCT AAT C6C AOT-CTT 

CAT TCA AAA CAT AAT CAC 6CA AAA «A AXX « ^ 

A.P S«r l.y- Asp J-npCln j'oS 2'10 



2715 



5n 2j K ?5 SK k: vs. SI ij; s ^ ^ 

2730 ^^^^ 

5s ;if IS ;it 12 it^p - - ;i - - - 

2745 2750 

XCC CCA TTC ACT TCT A« A«C TCA AOC AAA CAC ACT TCX OCT ACT CCO 

Thr pro Pha Scr Ser Ser S«r S«r 5«r i.yp ^^^^ 
2760 2765 

2780 ^/vs 

5;? s?;;? s ?s IS ;S S5 s is s is ;s 



2795 



* »fcr IAS CCA CAT TCC AAA ACT OAC AOC ACA 

fS i^IS JIS 5S S5 1 SJ =.r ... "' - 

. ^rr CCA ACC CAA ACT CCT AAS CCC CAT TCT CCC TCT TAC CTT 

ser lly 5" ^ Pro Ly. Ar, Hx. Ser^CIy Ser Tyr L.u 



7«7fi 



7926 



7974 



8022 



8070 



8116 



6166 



8214 



6262 



8310 



83S& 



8406 



84S4 



8SO; 



esse 



2825 



5602 

6663 
8722 
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CTG ACA TCT CTT TAAAACACXC CXACXATCAA ACTXACXAAA TTCTATCtTA 

Val Thr S»r V*l 
2640 

ATTACAACTG CTATXTACAC ATTTTCTTTC AAATGAAACT TTAAAACACT CAAAAATTTT 
CTAAATAGCT TTCATTCTTC TTACACWTT TTTCTTCTCG AACCCATATT TCATACTATA 

CTTTCTCTTC ACTCCTCTTA TTTTCCCACC CACTCTTCAT CCTTACCAAA AAATACAAAC 6782 

CCAACTXTCT TTCTACACTA TGTTITACAT CTATTTAAAC TACCATCCCA TCCCAACTTC 6842 

CTXAATTATT CCTTGTCTAA XXTAATCAAC ACTACACATA CCAAATATCA TATATTCCTC 8902 

TTATCAATCA TTTCTACATT ATAAACTCAC TAAACTTACA TCAOCCCAAA ATTCCTATTT 8962 

ATGCAAAAAA AAAATCTTTT TCTCCTTCTC ACTCCATCTA ACATCATAAT TAATCATCTC 9022 

CCTCTCAAAT TCACAGTAAT ATCCTTCCCG ATCAACAACT TTACCCACCC TGCTTTGCTT 9082 

ACTCCATGAA TCAAACTGAT OGXTCAATTT CACAACTAAT CATTAACACT TATCTCGTCA 9142 

CATCATGTCC ATAGAGATAG CTACAGTCTA ATAATTTACA CTA77TTCTC CTCCAAACAA 9202 

AACAAAAATC TCTGTAACTG TAAAACATTG AATCAAACTA TTTTACCTGA ACTACATTTT 9262 

XTCTCAAACT AGCTA6AATT TTTCCTATCC TCTAATTTCT TCTATATTCt CGTATTTCAG 9322 

CTCACATGGC TCCTCTTTAT TAATCACACA TGAArrCTCT CTCAACAGAA ACTAAATGAA 9382 

CATTTCAGAA TAAATTATTC CTCTATGTAA ACTGTTACTC AAATTGCTAT TTCTTTGAAC 9442 

CGTTTGTTTC ACATTTCTAT TAATTAATTC TTTAAAATGC CTCTTTTAAA ACCTTATATA 9502 

AATTTTTTCT TCAOCTTCTA TCCATTAACA GTAAAATTCC TCTTACTCTA ATAAAAACAT 9562 

TCAACAAGAC TCTTCCCACT TAACCATTCC ATCCCTTCGC ACTT 9606 

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2843 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Ala Ala Ala Ser Tyr Asp Gln Leu Leu Lys Gln Val Glu Ala Leu

Lys Met Glu Asn Ser Asn Leu Arg Gln Glu Leu Glu Asp Asn Ser Asn

20 25 30

His Leu Thr Lys Leu Glu Thr Glu Ala Ser Asn Met Lys Glu Val Leu
35 40 45

Lys Gln Leu Gln Gly Ser Ile Glu Asp Glu Ala Met Ala Ser Ser Gly

50 55 60
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Oln lie MP L.U L.U OU Arg I^" «-/• ^« ao 



65 " 



Aan L«u Aap Ser Car 

80 

„. ,„ «, v.i - '5; '51 

s.r « CU .1, S.. VU s.r 5.. «r 01, c;. c,. »r 

^ 100 

V.1 ,ro H.. Olr se. pro Arg Ar, Oly PK. V.I A.n Cly S.r A., 

Olu sec Thr Cly Tyr «o Clu Leu Clu Ly. Cl« Xr, S.r Leu 

130 

Al. A.P A.P .y. Clu clo ly. Clu Ly. A.P Trp tyr Tyr Al. 

145 "° 



cm I-u cm A.n ^« Thr Ly. Ar, 11. A.p .« t.u Pro I^u Th. Clu 

X.n Ph. ser ^« Cln Thr A.p t.u Tjr Ar, A., Cln L-u Clu Tyr Clu 

180 

X.. AT. cm XI. Arg v.1 Al. Me. Clu Clu Cln L.u Cly Thr Cy. Cln 

195 

rin Arcs Arq He Al* ^rg He Cln Cln Il« 
MP Met Clu Ly« xr9 Al* Cln Arg Arg ^^v 

210 ^-^^ 

Cl„ Ly. A.P 11. L.U Xrg He Ar, Cln teu teu Clr. S.r Cln Al. Tnr 

225 

Cl« Al. CIO AT, ser ser Cln A.n ty. Hi. cl« Thr Cly *cr Hi. A.p 

245 

^1.. f-i« ftiv VaI Glv Clu !!• A«n Met Al* 
XU Clu Arg Cln A.n Clu Cly cln Cly vai Giy 



260 

^1 »o« fiW Cln Cly S«r Thr Thr Arg Met Atp Hi» Clu Thr 
Thr S«r Cly A»n Cly Cin <iiy » 

275 

S.r V.1 X.U S.r ser ser ser Thr Hi. Ser Al. Pro Arg Arg L.u 

290 

Thr ser Hi. l^u Cly Thr Ly. V.l clu He. v.1 Tyr S.r L.u Leu S.r 

305 

Her L.U Gly Thr Hi. A.p I.y. A-P '^•P ^^5 

sar ser ser Cln A.p s.r cy. Xl. s.r Het Ar, Cln s.r Cly Cy. 

340 -^^^ 
pro Leu Leu lie Cln teu .e. Hi. Cly A.n A.p Ly. «p Ser V.l 

355 

* civ Ser tv» Clu Al* Arg Ala Arg Ala Ser 
Leu L«u Cly A.n Smr Arg Cly s.r i.y. 

370 

M. Ala Leu Hi. A.n II. He Hi. Scr Cln Pro A.p A.p Lys Arc Cly 

365 
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*r9 xr, CX« n. Ar, V.X fu HI. L.« l-u Clu Cln II. Ar, M. Tyr 



405 



Cy. Clu Thr Cy. Trp Clu Trp Cln Clu AU Hi. Clu Pro Cly H.t A.p 
Cln A.P Ly. A.n ?ro Ket Pro Al. Fro V.l Clu Hi. Cln II. Cy. Pro 



43S 

V.l cy. v.l L«« Met Ly. Ser Phe A.p Clu Clu Hi. Arg Hi. 
450 «5 



Al. V.I cy. v«i Lmv net ^i- -•r 



XI. A.n Ol« L.U Cly Cly "n Al. II. AX. clu l^u I,.u Cln 
465 

V.1 A.p cy. Clu M.t lyr Cly Thr A.n A.p Hi. Tyr s.r II. Thr 

Leu Ar, Arg Tyr Al. Cly M.t Al. t.u Thr A.« L.u Thr Ph. Cly A.p 

5QQ 505 

v.l Al. A.n Ly. Al. Thr L.u Cy. *.r M.t Ly. Cly Cyi M.t Arg Al. 

5J5 520 

L.U V.1 Al. Cln leu Ly. S.r Clu S.r Clu Aap L.u Cln Cln v.l II. 

530 S35 5«o 

Al. S.r V.l teu Arg A.n L.u s.r Trp Arg Al. A.p v.l A.n S.r Ly. 

545 550 

Ly. Thr L.u Arg Clu V.l Cly S.r V.l Ly. Al. L.u M.t Clu Cy. Al. 

565 570 s'> 

L.U Clu v.l Ly. Ly. 61" Thr Leu Lye Ser V.l Leu Ser Al* Lc« 

see 

Trp A.n L.U S.r Al. Hi. Cy. Thr Clu A.n Ly. Al. A.p U. Cy. Ai. 

595 600 

V.1 A«p Cly Al« Leu Ala Phe Leu V.l Cly Thr Leu Thr T>r Arg S.r 
610 "° 

Cln Thr A.n Thr L.u Al. II. He Clu Ser Cly Cly Cly II. L.u Arg 
625 630 "S 6«0 

A.n V*l S.r S«r Leu II. Al. Thr A.n Clu A.p Hi. Arg Cln lie L.u 
645 650 

Arg Clu A.n A.n Cy. L.U Cln Thr L.u L«u Cln Hi. L.u Ly. S.r Hi. 

660 

Ser L.O Thr lie V.l Ser A.n Al. Cyf Cly Thr l.u Trp Air. L.u S.r 

675 6«= 

Al. Arg A.n Pro Ly. A.p Cin Clu Al. t.c irp A.? M.-. Cly Al. v.l 

690 695 700 

ser Met Leu Ly. A.n Leu lie Hi. S.r Ly. Hi. Lyf M.t He Al. Met 

705 '1° 

Cly Ser Al. Al. Al. Leu Arg A.n L.u M.t Ai* A.n Arg Pro Al. Ly. 
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WOW/13103 .e:- 

• 1 fi*r pro Civ S«r $*r L«u Tto Ser Leu 

Tyr I.y« X.p AU A«n lie Met Ser fro y 
' 740 

t rm Lv« AU Leu Clu Al» Clu Leu A.p Alt Cln Hie 

Hi« V»l Arg Ly« Cln Ly« Ai* J^eu _ 
7S5 



l^u ser Clu Thr Phe Aep Atn 



11* Af? A.n Leu Ser Pro Ly» AU Ser 

78C 

„U XT, ser .y. Oin Arg Hi. Ly. Cln .e. Tyr aly A-P Tyr v.x 

..1 x.^ xmn A>n fcrfl ser A«p Aan the Am Tnr 
Ph. Aop Thr A.n Arj Hil A.p W *•« ^ 815 



770 "5 



805 



Cly A.n H.t Thr V*l L.« Ser Pro Tyr Leu 



820 



Am Thr Thr V*l Leu Pro 
830 



ser ser Ser S.r S.r Ar, CXy Ser ..u A.p Ser Ser Ar, Ser CXu Ly. 

83S 

„p « ..r 0>" tIS 

8S0 

Pro AU Thr Gl» A.h Pro Gly Thr Ser Ser Ly. Ar, Oly x^u Cln He 

865 

ser »hr Thr AU AU Cln He AI. Ly. V.l Met Clu Clu v.l S.r Al. 

665 

lie Hi. Thr ser Cln clu A.p Ar, S« s.r Cly Ser Thr Thr Clu L.u 

900 

Hi. cy. V.1 Thr A.P Clu Ar, A.n Al. Leu Ar, Ar, Ser Ser Ala AU 

915 "° 

Hi. Thr Kl. ser A.n Thr Tyr A.n Ph. Thr Ly. Ser Clu A.n Ser A.n 

930 

AT, Thr cy. ser Met Pro Tyr AU Ly. Leu clu Tyr Ly. Ar, Ser Ser 
A.P ser L.U A.n S.r V.l Ser S.r A.n A.p Cly Tyr Cly Ly. Ar, 

Cly cm Ly. Pro S.r II. Clu S.r Tyr S.r Clu A.p A.p Clu S.r 

.y. Pne cy. ser Tyr Cly Cln Tyr^.rc AU A.p L.u AU^H.. Ly. U« 

H.. ser Al. A.n Hi. Met A.p A.p A-" ^-P "y CU^Leu A.p Thr Pr= 

1010 ^^^'^ 
lie A.n Tyr Ser Leu Ly.^Tyr Ser A.p 61- Cln Leu A.n Ser Cly Ar|^ 
1025 1030 

Cln S.r Pro Ser Cln A.n Clu Ar, Trp AU Ar, Pre Lys Hi. Ile^H. 

1045 xwjv 

Clu A.p Clu lU Ly. Cln S.r Clu Cln Ar; Cln Ser Ar, Aan^Cln Scr 

106C ^^"^ 
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Thr Thr Tyr Pro VaI Tyr Thr Clu s«r Thr Awp Atp Lyt Hi» Lmu tyt 
J07S 1080 1085 

Phe Cln Pro His Phc Cly Gift CXn Clu Cyt v«l scr Pro Tyr Arg Ser 

1090 1095 1100 . _ 

Xrg Cly AU Aen Cly scr clu Thr ABn Arg Val cly Ser Asn His Cly 
1105 1110 1115 1120 

He Aan Cln Aon VaI Ser Cin Ser Leu Cy» Cln GU A«p A»p Tyr Clu 
1125 1130 11^5 

Asp Asp Lyi Pro Thr Asn Tyr Ser Clu Arg Tyr ser Clu Clu Clu Cln 
1140 IKS 1150 

His Clu Clu Clu Clu Arg Pro Thr Asn Tyr Ser lie Lys Tyr Asn Clii 
1155 1160 1165 

Clu Lys Arg His val Asp Cln Pro He Asp Tyr Ser Leu Lys Tyr Al« 

1170 1175 llfiO 

Thr Asp lie Pro ser S«r cm Lys Cln Ser Phc Ser Phe Scr Lys Ser 
lies 1190 1195 1200 

ser Ser Cly Cln Ser Ser Lys Thr Clu His Met Ser ser Ser Ser Clu 
1205 1210 1215 

Asn Thr ser Thr Fro Ser ser Asn Ala Lys Arg Gin A«n Cln Leu His 

1220 1225 1230 

Pro Scr Ser Ala Cln Ser Arg Ser Cly Cln Pro Cln Lys Ala Ala Thr 
1235 1240 1245 

Cys Lys Val Ser Ser He Asn Oln Clu Thr He Cln Thr Tyr Cys Vtl 
1250 1255 1260 

Clu Asp Thr Pro He Cys Phc Ser Arg Cys Ser Ser Leu Scr Ser Leu 
1265 1270 1275 1280 

Ser Ser Al« Clu Asp Clu He Cly Cys Aen Cln Thr Thr Cln Clu Ala 
1285 1290 1295 

Asp Ser AlA Asn Thr Leu Cln lis Ala Clu He Lys Cly Lys He Cly 
1300 1305 1310 

Thr Arg Ser Ala Clu Asp Pro Val Ser Clu VaI Pro Ala Val Scr Cln 
1315 1320 1325 

His Pro Arg Thr Lys Ser Scr Arg Lou Cln Cly Ser s«r Leu Ser Scr 
1330 1335 1340 

Clu Ser AlA Arg His Lys Ala Val Clu Phe Pro Ssr cly Ala Lys Ser 

1345 1350 13SS 136C 

Pro Ser Lys Ser Cly Ala Cln Thr Pro Lys Ser Pro Pro Clu His Tyr 
1365 1370 1375 

Val Gin Clu Thr Pro Leu Met Ph« Ser Arg Cye Thr Ser Val Ser Ser 
1380 1385 139C 

Leu Asp Scr Phe Clu Scr Arg scr He Ala Scr Ser Val Gin Ser Clu 
13?5 14C0 140= 
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.ro Cy. S.r Cly Met V.l S.r Gly tU II. S.r Frc^ST *.p L.« Fro 
X.p^Ser pro Ciy Cln TJr^Het Fro Pro s.r Ar|^s.r Ly.-Thr .ro Pro^ 



.1 «-h- Lv« kra Glu val Pro Ly« X«n Ly9 

Pro Pro Pro cm Thr AU Cln Ly» ^9 ^455 
1445 1^*° 

pro Thr Xl.^CXu Ly. Ar, CXu S.r^Cly, Pr= Ly. Cln Al.^Al. V.i 



Mn AU V.I Cln AT, V.1 Cln V.1 X.u Pro ..p Aj.^X.p Thr I.. 
1475 1**^ 
HX..Ph. Al. Thr Glu Jer^Thr Pro A.p Cly Phe^Ser Cy. S.r Ser 



1490 



, , T»ti xao Clu Pro Phe lie Cln Lym A«P V*l 

ser !.•« S.r *U Leu " xsis 1520 



1S05 1"° 
Cl« t.u AT, lie M«^Pro Pro Val Cln Clu^A-n A.p A.« Cly A.n^Clu 

Thr Clu S.r Clu_Cln Pro Ly. Clu S« A.n Clu A.n Cln Clu^ty. Cl« 



.1. Clu I.y. Thr II. A.P S.r Clu Ly. A.p Leu Le. A.p^A.p S.r A.p 

isss 1^^° 

*sp ..p x.p n. Clu II. L.« Clu Clu cy. 11. Ue^ser AU K.t Pro 

1570 15^^ 

T»,r Ly. 3er S.r Xr, ^y. Cly Ly. Ly. Pro Al.^Cln Thr Al. S.r Ly.^ 



1585 



1590 



leu pro pro pro Val^AU Ar, Ly. Pro S.z^Cln I-u Pro V.l Tyr^Ly. 

Uu Pro S.r Cln A.n Ar, Leu Cln Pro Cln Ly, His v.l^Ser Phe 
1620 xc-i3 
Thr pro Cly^A.p A.p «et Pro Arg^val Tyr Cy, Val Clu^Cly Thr Pro 

lie A.n Ph. S.r Thr Al. Thr Ser L.u S.r A.p L^u^Thr He Clu Ser 

1650 1**^ 
Pro^Pro A.n CJ« L.u U.^AU Cly Clu Cly v.l^Ar, Cly Cly Al. Cln^ 



ser Cly Clu Phe Clu Lys A.p Thr Ue^Pro Thr Clu Cly Ar,^S.r 

1635 

, ^>r. riv civ LVB rr.r Ser Ser Val Thr lie Pro Clu 

Thr Acp Clu Xla Cln Gly Cl^ L> 8 --r c» 

1700 ^^^^ 

Tr,« A^ft CI',; Clu Civ Asp Il« Leu Ala Clu Cye He 
Leu A«p Asp Asn Lys A.a Ci. wiu r 

1715 -^2-' 
S.r_Ala Met Pro Lye Cly^Ly. Ser Mi. Ly. Pro^Phc Ar, Val Ly. 



1730 
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Ly. lio H«t A.p cm Val^Cln Cln Xl. S« K1.^5er S«r St XU Pro^ 



1745 



Uy, A.n Cln Leu A.p cly Ly. Ly. Ly.^Ly. Pro Thr Ser Pre^V.l 



176S 



tyc pro lie Pro Cln M.n Ihr OU Tyr Xr, Thr Ar, V*l Ar, ly. A.n 

' X780 ^ 

Al. A.p ser Ly. A.n A.n L.u A.n AU Clu Arg V.l Ph. *.r A.p A.n 

X795 

tyt A«p S«r Ly« Ly» ain X«n L«u Ly» Xtn X«n Str Lyi X*p Phe Xin 

IfllO 1®^^ ^'^^ 

A.p Ly. Leu Pro A.n A.n Clu A.p Ar, V4I Ar, Cly S« Ph. Al. Phe^ 

X825 1830 ^^^^ 

A.p «.r Pre Hi. Hi. Tyr Thr Pro lie Clu 61y Thr Pre Tyr CY" P»»e 
X84S liSO l»»> 

Ser Arg A.n A.p Ser U« Ser Ser l«« /ep Ph. A.p A.p A»P *»P V4l 

i860 1865 J.e7u 

ASP teu Smr Xrg Clu Ly. XU Cl« L«u Xr? Ly. Al. Lyj Cl« X.n ty. 
1875 1960 1885 

Clu S«r Clu Xl4 Ly» Vml Thr Ser Kit Thr Clu L«u Thr Ser Xen Cln 

1890 1895 1900 

Cln Ser XU Xen Lyfl Thr Cln Xla lie XU Ly. Cln Pro He A.n Xrg 

1905 1910 1915 1920 

Cly Cln Pro Lye Pro He L.u cln Ly. Cln Ser Thr Phe Pro Cln ser 
1925 1930 1933 

ser Lye Aep 1^J^^^«> ^•P 1945^^* ^^"^ ^'^ 1950^*" 

X8n Phe XI* He Clu X8n Thr Pro Vel Cy. Phe ser Hi«/«'^ Ser Ser 
IfSS 1960 1965 

Leu Ser Ser Leu Ser Xep He Xip Cln Clu Xen X«a Xen Lyi Clu X*n 

1970 1975 1980 

Clu Pro He Lye Clu Thr Clu Pro Pro Xip Ser Gin Cly Clu Pro Ser 
1985 1990 1995 . 200C 

Lye Pro Cln Xl. Ser Cly Tyr XI* Pro Ly. Ser Phe Hi. V.l Clu X.p 

^ 2005 2010 2015 

Thr Pro Vel Cy. Phe Ser Xr; X.n Ser Ser Leu Ser Ser Leu Ser He 
2020 2025 2030 - 

XBP ser Clu X.p xsp Leu Leu Cln Clu Cy. He Scr Ser Xla Met Pre 
^ 2035 2040 2045 

Lv. Ly. Ly. Ly. Pro Ser Xry Leu Ly. Cly Xsp X»n Clu Lyi Hi. Ser 

^ 2050 2055 2060 

pro xrg Xsn «et Cly Cly He Leu Cly Clu A.p Leu Thr Leu Xep Leu 
2065 2070 2015 2080 
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... n. «. »' nW' 

Tit cm Clu ClY Ala Ain Sex n« VaX 
Clu A«n Ph« A.p Trp Ly« AU U* ^J^^^^ jno 
2100 

.„ .„ -i. «» »• 'AW' 

2US ^^^^ 

.„ ..P .« "r IIU"' 

2130 2135 
cxy S.r .ro Ph. Hi. xnr Pro Mp Cln Clu^C.u ty. Pro TJr^ 
2145 2^^° 

*.n .r. cxy SHo"^ 

CX„ Thr Lr. ly. U. Ser Clu Ser I-r- IX. CJy^C., Ly. 
21S0 

Ti. Thr Civ Lvfl Vil Arg S«r Asn Scr Clu 
Ly. Val Tyr Ly« ser L«u lie Thr Cly tys ^^^^ 
2155 2200 

^1 ri« «.t: Lvi cm Pro Leu Gin AU Atn M«t Pro Ser lie 
lie Ser Cly Cm Met Lyi 2220 
2210 22i:> 

u:« Ti» Pro Clv Vel Arg A»n Ser Ser 
ser Arg Cly Arg Thr H« He Hx* He Pre Ciy^ 9 ^^^^ 

2225 

c*r i,v« LV0 Oly Pro Pro Leu Lyi Thr Pro 
Ser ser Thr Ser ^^^^^^^ ^ 2250 22S5 

r « . iser Clu Cly cm Thr Ale Thr Thr Ser Pro Arg 

Ala ser Lyc Ser Pro Ser ciu v^y * ^270 
2260 

Cly .Xa .ys^Pro Sex VaX Ly. Ser^PXu Le« Ser Pro V.l^XX. Ar, CXn 

inr S.r cm ne Cly Cly Ser s.r Ly. *1. Pro s.r Xx, Ser Cly Ser 



2290 



2300 



MP ser Thr Pro S.r Ar, Pro *X. Cln Cln^Pro L.u S.r Ax, Prc^ 



23bS 33X0 
XI. Cln S.r Pro Cly^Ar, A.n S.r U- S.r^Pro Cly Ar, A.n Cly^Il. 

ser pro Pro A.n Ly. I..- Ser Cln L.. Pro Ar, Thr Ser S« Pro S.r 

2340 

.pw T«-i ser Cly ser Cly Lyi Met Ser T^r Thr Ser 
Thr Ala Ser Thr Lye Ser ser wi> * / 
23S5 2360 

, . r^r. M.tL ier Cln Cln Asn Leu Thr Lye Clr. Thr Cly Leu 
pro Gly AC5 Cln Met Ser cin o ^^^^ 

2370 237b 
ser Lys A.n Al. Ser S.r lie Pro Xr, Ser Clu^Scr AU S.r Ly. Cly^ 
2385 2390 

..n Cln Het A.n Mn Cly A,n Cly AU^A.n Ly. Ly. v.l Clu^Leo 
Ser Arg Met Ser Ser Thr Lys Ser Ser Gly Ser Glu Ser Asp Arg Ser
2420 2425 2430

Glu Arg Pro Val Leu Val Arg Gln Ser Thr Phe Ile Lys Glu Ala Pro
2435 2440 2445

Ser Pro Thr Leu Arg Arg Lys Leu Glu Glu Ser Ser Phe Glu Ser
2450 2455 2460

Leu Ser Pro Ser Ser Arg Pro Ala Ser Pro Thr Arg Ser Gln Ala Gln
2465 2470 2475 2480

Thr Pro Val Leu Ser Pro Ser Leu Pro Asp Met Ser Leu Ser Thr His
2485 2490 2495

Ser Ser Val Gln Ala Gly Gly Trp Arg Lys Leu Pro Pro Asn Leu Ser
2500 2505 2510

Pro Thr Ile Glu Tyr Asn Asp Gly Arg Pro Ala Lys Arg His Asp Ile
2515 2520 2525

Ala Arg Ser His Ser Glu Ser Pro Ser Arg Leu Pro Ile Asn Arg Ser
2530 2535 2540

Gly Thr Trp Lys Arg Glu His Ser Lys His Ser Ser Ser Leu Pro Arg
2545 2550 2555 2560

Val Ser Thr Trp Arg Arg Thr Gly Ser Ser Ser Ser Ile Leu Ser Ala
2565 2570 2575

Ser Ser Glu Ser Ser Glu Lys Ala Lys Ser Glu Asp Glu Lys His Val
2580 2585 2590

Asn Ser Ile Ser Gly Thr Lys Gln Ser Lys Glu Asn Gln Val Ser Ala
2595 2600 2605

Lys Gly Thr Trp Arg Lys Ile Lys Glu Asn Glu Phe Ser Pro Thr Asn
2610 2615 2620

Ser Thr Ser Gln Thr Val Ser Ser Gly Ala Thr Asn Gly Ala Glu Ser
2625 2630 2635 2640

Lys Thr Leu Ile Tyr Gln Met Ala Pro Ala Val Ser Lys Thr Glu Asp
2645 2650 2655

Val Trp Val Arg Ile Glu Asp Cys Pro Ile Asn Asn Pro Arg Ser Gly
2660 2665 2670

Arg Ser Pro Thr Gly Asn Thr Pro Pro Val Ile Asp Ser Val Ser Glu
2675 2680 2685

Lys Ala Asn Pro Asn Ile Lys Asp Ser Lys Asp Asn Gln Ala Lys Gln
2690 2695 2700

Asn Val Gly Asn Gly Ser Val Pro Met Arg Thr Val Gly Leu Glu Asn
2705 2710 2715 2720

Arg Leu Thr Ser Phe Ile Gln Val Asp Ala Pro Asp Gln Lys Gly Thr
2725 2730 2735

Glu Ile Lys Pro Gly Gln Asn Asn Pro Val Pro Val Ser Glu Thr Asn
2740 2745 2750
„ u.i clu Arc Tnr Fro fhe S«r S«r S«r s«r S*r s«r 
CXu Ser pro 11* v.l Clu Arg ^ 
275S * 

s.r CIV Ihr Val Al* Alt Arg VaI Thr Pro Ph. 
tym Hit S-r P*^" ,ii5 2780 
• 2770 - 

... „. «. «" JiJs"' '"0 

27B5 ^ 

< cln lie pre Thr Pro v*l A.n A.n A.n thr ty. ty. Arg 
xrg pre Ser cln ij*^^'^* jgio 2B15 

. eiu Ser Ser Cly Thr Cin Ser Pro Lye 

A.p ser Ly. Ihr^AiP Ser Ihr clu Ser^Ser ^^^^ 

XT, Hi. S.r Cly S.r Tyr Leu V.l Thr S.r V.l 

2S35 

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3172 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE: 

(B) CLONE: D?i(TB2) 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..630

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

;s ss s s ^? 5S IT. s: st? S!? S5 

20 ^* 

^ »Tr iCT CAC CTT CTG GCC AAC CTC CAC 

s 1^ J?: - -H " '« 

35 

»rr ICC nc ATC OCT CTT CCT CTC ATC CGA 

Si S5 5S SiS ;S - ... M. ^jj v., n. 0.. 

r^/- CTC TTC CCT TAT CCA CCC TCT CTC CTC TCC 
CTC GTG CCC TTC TAC CTC CTC TTC CCT T 
Leu Val Ala Lau Tyr L«u Val Phe Cly *yr 



4fi 



96 



144 



192 



240 
»»f. f-rt: XIX GCX TTT CCC TXC CCX CCC TAC *TC TCA ATT AAA CCT ATA 

ni Tir ^ ryr fro ax. tyr n. u. ty. "* 
as 



268 



336 



364 



432 



ABO 



526 



ii; 5S - - iH - - - -1 " 
^ s| s ;n Jii s?i s s tn 
IS ?s s in ss ij: 17. s|j 1% 

^^/^ ^r^ xcc ecT TCT AAT CCC CCT CXA CTC CTC TAC AAC CCC 

tVo in IV. Jin ay XU t.- v.. tyr ty. Xr, 
J45 ISO 

in t?5 s ?n z K ss ij; gs sf ;s 5n 

55 S5 S S| SfS 5S 

IBO 

AAA CAA CCC AAC AAA CCT ACC CTC AXT TTA CTC CCT CAA CAA AAC AAC 
^ Clu AU Ly. Ly. Al. Thr Val A.n L.u L«u Cly Clu Clu Ly. Ly. 

195 200 

ACC ACC TAAACCACAC TAAACCACAC TCCATCCAAA CTTCCTCCCC TCTCTCTACC 

Ser Thr 

210 

T7CCTACTCC ACCTTCATCT TATATTACCC ACTCTCCTAT AATTATTTTA ATAATC7TCC 
C7TGCAAACA T7TTTCACAT ATTAAACATT CCAATCTCTT CTAACTTTCT TTGCTTACTT 
TTACTGTCTA TATATATACC CACCACTTTA AACTTAATCC AGTGGGCAGT CTCCACGTTT 
TTCCAAAATG TATTTTCCCT CTGGCTAGGA AAACAICTAT CTTCCTATCC TCCACCAAAT 
ATAXACTTAA AATAAAATTA TATACCCCAC ACCCTCTCTA CTTTACTCGG CTCTCCCTCC 
ACGSATTTTC TCTGTACTTA CATTTAGCRT AATCTTTATG GTTCTACTTC CTRTAATCTA 
CAATTTTATA TAATTCH.^KA A7GTTTTTAA TCTATTTGTC CACATCTACA TATGGAAATG 
TTACTCTCTC ACTACAKCAT CCATCATCCT CATCCGGAGC CACCACCCCA ACCTTCTATC 
TGTCATTTAT AACTTCTCTA CACTAAGACC ACCTCCCAAA AGCTGCAGGA ACCATTCTCC 
TCCTCTGCTC TACTAAATAA TACTTTACCA AATACCTGAT TAATATGCAA CTGAACAAAG 
TCAGAAATGA AATCGAATGG ACATTGCCC7 GC7TCTTTCC C7AGTATATC CCATATCAA7 
ACCAGGATAC C7TTATAAAC CAC7TAG7TA CT7AG77AC7 CAC7C7ACTC ATAAATCGCC 
AAATTTACAC ACACACACAC ACACACACAC ACACACXCAC .C.CACACAC ACACACACAG 



624 
660 

740 
800 
660 
920 
980 
1040 
1100 
1160 
1220 
1260 
1340 
1400 
14&0 
KaXACCCTCT IU.CTCTCAXT TCCCTC.AA. ACTXCTMTA CTCTCTTXTC 
™«T.tT TCTCXXTTCT CXXaXTCCTX CXHTCCXK«C CATTTCTCCT 
,,OSC«XCXH ACXTCrrCXT TT.«TCTTCT TTCCCATCT TCTTTTTTM 
««,CTTCTC RACmtCVC CACCTCTO.T TACATCTATC TTCrrCTTTC 
^CAACATCC TAATCRCCAC ACCTACCTCT -KAaMCCAATT CTCCCACANT 
^*K>.TMHC CCATAATCTC CTTCCCAATA CITAACTCAA TCIATCTTCA 
CCCCTXTAA^ CTCAAACACA ACACCCTTCC CTACTTTACA ACXCAOAC^TC 
CATTTAAXXC CCCTCATCCC TATTCTXraT CTTOATAACC .CCACAK.AC 
CTXCAOAHCA CTAAACTTXA HHCOOATCTC TCCAT«ATC TOCCAAKTCC 
CA*TT«TCT C«.CTACAAA ATCTCAGTTT TACACCATAC TCTTAAaACT 
TAAACTACAC TAXAACAACT CTATAACTAA ACTAACAACA TTAAATATCC 
CTATtttTTA «CCAAAtAA ACATCATTAG CTCACCTTCA CHXAACAATC 
XTKACAATGT CICATGATGT HAANAATATT AAAGATAICA ATACTAAGTC 
KHCTAATATA ATMCCAtCA CACCATTTAT TTTCCCCA« AAAACAGTCG 
OCTTTTATIA AACTTAAAAC TTXCTAGAAA CCAAACAAAA TTGXTCrTCO 
ACTTTTAGAT TAAAAAAATT TTAAGTAWCT ACCAGTATTT AAATCCTTTT 
XAAGTACAOT TTTCTT«TG GCACAATOAA AATCACCAAC KTCTACCATA 
AXTCAGATTG ACAGCATATA GAATATATTX rCACACXXCX «X«XGGTX 
TXTTGCTCAT XXTGXCTTXC XCGCIAXXAH TAGMIKTAXA ATXC7AXXTT 
TCCAATTTTT TTTXGTTCCC TTGACACCXX XAXTtXXGTT AACTGTTGCT 
CTGTXXXTGT tAACACCXGC XCXXGTIXAO AATTCAGCXG TTCTGTTGCA 
^XTGAAATAC TGCCXXCCCX AGAOXXWAX AAACXAATTC ACCCXX;XGCC 
XCAAGOGXTX AXTXGAXXGX GAATAGTCTX XCAAXGGXXT GTXCXTXCXC 
^CXOCTTX AAXXCXTCXX CXXXCAATTC CXCCAGCXGT TXTXCCCTXA 
XCAXXCAXXX GGATCAACAA CTCCTXCXCX CGGGXXCXC CCICTXCXCX 
^TGAGCXC ACCCtXCACA CTCXXXXCXC CTXTCCIGXA GXTCTGXXXC 
XATXXXXXGX TGXAAXXAAX ATXGAGWICX CXXXXXXXXX XXXCCXTGTC 
;U^XXGXCCXC ATGTXCXCCT TTXXXCXCCX XC7CCACC=A CTXGC.C.CC 
XWCCXATAXX IXCXXCICXC TCXCXCCCCC TC 

(2) INFORMATION FOR SEQ ID NO: 4:



TCCTATAAAC 


1S20 


XTTATCTTCA 


1S80 


KCCACTTXNA 


1640 


TATCATKACC 


1700 


CAWCCNWCT 


1760 


C7TTTTCXCT 


1£20 


ACTTCTACTC 


isao 


TACATACTAA 


1940 


KTATACACAG 


2000 


CCTTTTCAAT 


2060 


ACCCAOTACA 


2X30 


ACCTAAGATC 


2180 


ACACTATCAC 


2240 


TCATTACCCC 


2300 


GAGAAAATCA 


2360 


CCCATAAATA 


2420 


TACACrATAT 


2480 


CAAAAC7TAC 


2S40 


AXATTCTCAA 


2600 


CCCACTCTAA 


2660 


TCATTTCCCA 


2720 


T0CCTA5AAA 


2760 


AATTCCTACC 


2840 


CCTCAACCCT 


2900 


CACCTGAACA 


2960 


ACTGAATCCA 


3020 


CCCAATCCCA 


3060 


CCATTCACCT 


3140 




3172 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 210 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Ala Val Ala Ala Pro Val Tyr Pro Ala Leu Gly Thr Ala Pro Gly Gly
1 5 10 15

Glu Thr Val Pro Ala Met Ser Ala Ala Met Arg Glu Arg Phe Asp Arg
20 25 30

Phe Leu His Glu Lys Asn Cys Met Thr Asp Leu Leu Ala Lys Leu Glu
35 40 45

Ala Lys Thr Gly Val Asn Arg Ser Phe Ile Ala Leu Gly Val Ile Gly
50 55 60

Leu Val Ala Leu Tyr Leu Val Phe Gly Tyr Gly Ala Ser Leu Leu Cys
65 70 75 80

Asn Leu Ile Gly Phe Gly Tyr Pro Ala Tyr Ile Ser Ile Lys Ala Ile
85 90 95

Glu Ser Pro Asn Lys Glu Asp Asp Thr Gln Trp Leu Thr Tyr Trp Val
100 105 110

Val Tyr Gly Val Phe Ser Ile Ala Glu Phe Phe Ser Asp Ile Phe Leu
115 120 125

Ser Trp Phe Pro Phe Tyr Tyr Met Leu Lys Cys Gly Phe Leu Leu Trp
130 135 140

Cys Met Ala Pro Ser Pro Ser Asn Gly Ala Glu Leu Leu Tyr Lys Arg
145 150 155 160

Ile Ile Arg Pro Phe Phe Leu Lys His Glu Ser Gln Met Asp Ser Val
165 170 175

Val Lys Asp Leu Lys Asp Lys Ser Lys Glu Thr Ala Asp Ala Ile Thr
180 185 190

Lys Glu Ala Lys Lys Ala Thr Val Asn Leu Leu Gly Glu Glu Lys Lys
195 200 205

Ser Thr
210

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 434 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:
(B) CLONE: TB1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

V*l pro val VU V*l Gly Scr Ciy Arg *U Pre Ar, Hi. Pro Ala 



Pro iU* Al. H.t Hi. Pro AT, ^rg Pro A.p cly Ph. A.p 01, Leu Ciy 
Tyr Xr, Cly Ily Ar, A.p CXw Cin Ciy Ph. ciy ciy Ai. Ph, Pro 
Ai. AT, S.r Ph. 5er Thr Ciy S.r A.p t.u ciy Hi. Trp V.i thr Thr 
pro p!o A.P lie pro ciy Ser Ar, A.n L.u Hi. Trp Ciy Ciu Ly. s.r 
"o pro Tyr Gly V*i Pro Thr Thr S.r Thr Pro Tyr Ciu Ciy Pro Thr 



as 90 



Ciu Ciu pro Ph. ser S.r Ciy Ciy Cly Ciy Ser Vai Cin Cly Cln Ser 



100 



ser Ciu Cln Lev A.n Ar, Ph. Ai. Cly Phe Ciy 11. Cly i..« S 



Leu Ph. Thr Ciu Acn V*i Leu Ala Hie Pro Cy. U. V.i I^u Arg ^r, 
cm cy. Cln val A.« Tyr Hi. Ai. Cln Hi. Tyr Hi. Leu Thr Pro Ph. 



145 



ISO 



Thr V.1 lie A.« li. H.t Tyr S.r Phe A.n Ly. Thr Cln Ciy Pro Ar, 

16S 

Ai. L.U Trp Ly. Ciy Met cly S.r Thr Phe 11. Val Cln Cly v.l Thr 



195 



180 

Leu Cly AU Ciu Cly II. He Ser Ciu Ph. Thr Pro L.u Pro Ar, Ciu 



val L.U Hi. Ly. Trp Ser Pro Ly. Cln 11. Cly Ciu Hi. L.u Leu L.u 



210 



Ly. ser L.u Thr Tyr Val Vai Ai. M.t Pro Ph. Tyr Ser AI. Ser L.u 



225 



II. Ciu Thr val Cln Ser Ciu U. U. Ar, A.p A.n Thr Cly lie L.u 



245 



Ciu cy. vai Ly. Ciu Cly lie Cly Ar| Val He Cly Met Cly V.i Pro 



260 2" 
Hli S€r Ly. Arg l^u Leu Tzo L«u Uu Ur L.u IX. Fh. Pre Thr V^l 

275 280 2ib 

L*u Hit CXy v*l Leu Hi. Tyr lie lie Ser Ser V*l He Cln Ly. Phe 

290 295 300- 

V4l Leu Leu He Leu Ly. Arg Lyt Thr Tyr X.n Ser Hi. Leu KU Clu 

30S no 315 320 

ser Thr Ser Pro Val Cln Ser Met Leu Aap XI. Tyr Phe Pro Clu Leu 
325 330 335 

II. All A.n Phe XI. XI. ser Leu Cy. Ser Xap V.l He Tyr Pro 

340 34S 3*0 

Leu Clu Thr V.l Leu Hi. Xr? Leu Hit He Cln Cly Thr Arg Thr lie 
355 360 365 

He A.p A.n Thr A.p Leu Cly Tyr Clu V.l Leu Pre He A.n Thr Cln 

370 375 380 

Tyr Clu Cly Met Arg Asp Cy. He A.n Thr He Arg Cln Clu Clu Cly 
385 390 395 400 

V.l Phe Cly Phe Tyr Ly. Cly Phe Cly Al» V.l He He Cln Tyr Thr 
405 410 415 

Leu Hi. Al. XI. V.l Leu Cln He Thr Ly. He He Tyr ser Thr Leu 

420 *25 430 

Leu Cln 

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 185 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:

(B) CLONE: V6-39(T82)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Clu Leu Arg Arg Phe A.p Arg Phe Leu Hii Clu Ly. Asn Cys Met Thr 

a 5 10 IS 

A.p Leu Leu Al» Lye Leu Clu XU Lyt Thr Cly Val A.n Arg Ser Phe 
20 25 30 

IXe Ala Leu Cly Val lie Cly Leu Val Xla Leu Tyr Leu V.l Phe Cly 
35 ^0 45 
ryr Cly XX. S« t.u i^u Cy. X.n Uu IX. Cly Ph. Cly Tyr Pro Ala 

All 

70 



ryr II. S.r n. Ly. AU II. Clu s.r Pro *.n ty. Cl« A.p A.p Thr 
6S 70 
Cln Trp L.U Thr Tyr Trp V.l V.X Tyr ojy v.l P.. S.r II. JI. Cl« 

,Ke Ph. S.r X.P n. Ph. t.u S.r Trp Phe Pro Pn. Tyr Tyr He ..u 

xoo 

« rv« M«t Ala Pro Ser Pro Scr Aan Cly 

I,y. Cyf CXy Ph. t.u L«g Trp Cyf M.t Axa rr 

* ^ TV-. AM lie lit Arg Pro Phe Ph. L.U tya Hi. 
Ala Clu L.U L.U Tyr Ly. M li. -i^" 

130 

CX« S.r Gin Met A.p S.r Val V.I ty. te« Ly. A.p ty. AU ty. 

145 "0 

CI« Thr AU A.P AU lU Thr ty. Clu AU ty. ty. AU Thr v.l A.h 

165 

Leu L.U Gly Clu Clu Lyi Lyf S.r Thr 

XBO 

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2842 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:

(B) CLONE: APC

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met AU AU AU S.r T>-r A.p Cln Leu Leu LyQ Cln V*l Clu AU Leu 

Ly. Met Clu A.n S.r Asn Leu Arg Cln Clu L.u Clu A.p A.n Ser Aen 

Hi. l.u Thr Ly. teu Clu Thr Clu AU S.r A.n xet ty. Clu V*l teu 

j5 40 

Ly. CU L.U Cln Cly Ser lie Clu Asp Clu Ala Met AU Ser Ser Cly 



Cln He A.p L.u Leu 
65 



Clu Ar9 Leu Ly« Clu Leu Asn Leu Aap Ser Sc 



70 



7S flO 
A.n Ph. pro Cly v.l Ly. Leu Ar, S«r ly. M«t S.r f « at, S.r Tyr 

Cly s.r XT, Cl« Cly s.r val S.r «.r xr, s.r Cly Olu Cy- S.r Pro 

100 _ 
val Fro «et Cly Ser Phe Pro Ar9 Xr? Cly PK. v»l A.n Cly Ser Xrg 



115 



120 125 



Glu Ser Thr Gly Tyr Leu Glu Glu Leu Glu Lys Glu Arg Ser Leu Leu
130 135 140

Leu Ala Asp Leu Asp Lys Glu Glu Lys Glu Lys Asp Trp Tyr Tyr Ala
145 150 155 160

GXn L#u cm A.n Leu Thr Ly* Ar9 XI* A.p Ser Lmu Lau Thr Clu A.n 

1€S 170 175 

Phe Ser Leu Gln Thr Asp Met Thr Arg Arg Gln Leu Glu Tyr Glu Ala
180 185 190

Arg Gln Ile Arg Val Ala Met Glu Glu Gln Leu Gly Thr Cys Gln Asp
195 200 205

Met Glu Lys Arg Ala Gln Arg Arg Ile Ala Arg Ile Gln Gln Ile Glu
210 215 220

Lys Asp Ile Leu Arg Ile Arg Gln Leu Leu Gln Ser Gln Ala Thr Glu
225 230 235 240

Ala Glu Arg Ser Ser Gln Asn Lys His Glu Thr Gly Ser His Asp Ala
245 250 255

Glu Arg Gln Asn Glu Gly Gln Gly Val Gly Glu Ile Asn Met Ala Thr
260 265 270

Ser Gly Asn Gly Gln Gly Ser Thr Thr Arg Met Asp His Glu Thr Ala
275 280 285

Ser Val Leu Ser Ser Ser Ser Thr His Ser Ala Pro Arg Arg Leu Thr
290 295 300

Ser His Leu Gly Thr Lys Val Glu Met Val Tyr Ser Leu Leu Ser Met
305 310 315 320

Leu Gly Thr His Asp Lys Asp Asp Met Ser Arg Thr Leu Leu Ala Met
325 330 335

Ser Ser Ser Gln Asp Ser Cys Ile Ser Met Arg Gln Ser Gly Cys Leu
340 345 350

Pro Leu Leu Ile Gln Leu Leu His Gly Asn Asp Lys Asp Ser Val Leu
355 360 365

Leu Gly Asn Ser Arg Gly Ser Lys Glu Ala Arg Ala Arg Ala Ser Ala
370 375 380

Ala Leu His Asn Ile Ile His Ser Gln Pro Asp Asp Lys Arg Gly Arg
385 390 395 400

Arg Glu Ile Arg Val Leu His Leu Leu Glu Gln Ile Arg Ala Tyr Cys
405 410 415
CXu Thr Cy. irp Cl« Trp Cln Clu JU Hi. cxu Pro cly Kec A.p Cl« 
420 

r ^ X. . M«t Pro AU Pro V*l CXu His Cln !!• Cy. Pro Alt 
X«p Lyt Am Pro Met rro a* 

435 

, ^ V 1 Met tyt 

VAl Cy» V«l Lau Met i.y* 

450 

. « /-I* ti« XlA Clu Leu Leu Gin V*l 
Met Aen Clu Leu Cly Cly Leu Cln Ala lie Al* ciu 

465 

„p cy. Clu H.t Tyr Cly L.. thr Mn A.P Tyr Ser II. Thr Leu 
^ xr, Xyr .1. Cly «et Al. L.u Thr A.n ..u Thr PH. Cly X.p v.l 

.1. Ly. L Thr X.U cy. S.r Met ty. Cly Cy. H« Ar, AU L.« 

515 

V.1 Ala «« ty. «1« JJS 

530 ^-^^ 
s.r v*l L.U AT, A.n I.U 5er trp Ar, AU A.p v.l A.n Ser Ly. Ly. 

545 

Thr Le« AT, Cl« V.1 Cly Ser V*l Ly. Al. Leu Met Olu cy. Al. L.« 

565 

-iM ^mr Thr Leu Ly» Ser VaI Leu Ser Ala Leu Trp 
Clu Val Lyi Ly« clu Ser inr teu 
560 

. »i. Hifl eve Thr Clu A.n Ly« Ala Aep He Cy« Al* V*l 
XBQ Ser Alt Hi9 Cye xnr j 

S95 

Aep Cly Ala Leu Ala Ph. Leu Val Cly Thr Leu Thr Tyr Ary S.r Cln 
610 

Xnr Asn Thr Leu Ala II. He Olu Ser Cly Cly Cly He Leu Arg A.n 

6:s "° 

v.1 ser ser L.u He AU Thr A.n Clu A.p Hi. Arg Cln He L.u Ar, 

645 

Clu A.n A.n cy. L.u Cln Thr Leu Leu Cln HI. Leu Ly. S.r Hi. Ser 

660 

X..U Thr lie V.1 S.r A.n AU Cy. Cly Thr L.u Trp A.n L.u Ser AU 

675 

A.n rr= ty. A.p cln Clu AU L.u Trp A.P Het Cly AU Val Ser 

690 

He. Leu Ly. A.n L.u lU Ki. Ser Ly. HU Ly. He. He AU Met Cly 

705 

S.r AU Al. Al. L.U Ar, A.n Leu Met AU A.n Ar, Pro AU Ly. Tyr 
l.y, A,p AU A.n lie Ket Ser pre Cly ser ser Le„ Pro Ser Leu BU 
V.1 xr, ;,y. cm .y. *X. I-" ^lu Xl. Clu X-u X.p Xl. Cl« Hi. X..U 

755 

x-« xan !!• A«P 

S«r Clu Thr Ph« Atp *' 

770 

xr, Ser .y. Ci„ xr, Hi. .y. CXn Ser L.u Tyr Cly X.p Tyr VU PJc 

785 

..p TKr X.n xr, Hi. X.p X.p X.n Kr, Ser X.p X.n Phe X.n T^r Cly 

X.n Met TKr v.; L.u S.r Pro Tyr t.u X.n Thr thr V.l J.« Pro S.r 

420 825 

S.r S.r S.r Ser Xr, cly S.r tew X.p S.r Ser Xr, Ser «« Ly. X.p 



835 



X,, ser L.U Clu xr, Clu Xr, Cly II. Cly L.u Cly A.B Tyr Hi. Pro 

855 



8S0 



M» Thr Clu x.« Pre Cly Thr S.r S.r ty. Xr, Cly teu Cln lie S|t 
Thr Thr XI. XI. Cln II. XI. ty. V.l M.t Clu Clu V.l ST Al. II. 



8B5 



Hit 



Thr s.r cm Clu x.p xr, Ser Ser Cly Ser Thr Thr clu Hi. 

900 



Cy. V.l Thr X.p Clu xr, X.n XU Leu Xr, Xr, S.r S.r XI. Al. Hi. 

' 515 920 

Thr Hi. ser X.n thr Tyr X.n Ph. Thr Ly. S.r Clu X.n S.r X.n Xrg 

930 '35 

Thr Cy. ser Met Pro Tyr XI. Lys Leu Clu Tyr Ly. Xr, Ser S.r X.n 

945 950 

A.p s.r L.U x.n S.r V*l S.r ser ser x.p Cly Tyr cly Ly. xr| Gly 

Cln Met Ly. Pro Ser II. Clu S.r Tyr Ser Clu x.p x.p Clu Ser Ly. 
980 

,ne cy. S.r Tyr Cly Cln Tyr Pro^Xl. X.p L.u XI. Hi.^Ly. II. Hi. 



1010 



S.r Xl.^X.n Hi. «.t X.p X.p^X.n X.p Cly Clu Leu^X.p Thr Pro He 

1025 1030 103S 1040 

ser Pre Ser Cln x.n clu xr, Trp XU Xr,^Pro ly. Hi. II. Il.^Clu 



x.n tyr S.r L.u Ly. Tyr S.r X.p Clu Cln L.u^x.n Ser Cly Arg Cln^ 



104S 



1060 



A.p Clu lie Ly.^Cln Ser Clu Cln Xr,^Oln S.r Xr, X.n Cln^Ser Thr 

Ser : 

toeo 



Thr Tyr Pro^V.l Tyr Thr clu S.r^lhr x.p X.p Ly. Hi.^Leu Ly. Phc 
cm pro Hi. Pn. cxy cm Cl« cxu cy. v.l Pro^Tyr ^r, s.r at, 
1090 



Cl« Thr Aen Ar, v.l Gly Ser A.« Hi. Cly lie 



Cly Al» Mn Cly Ser "u xnr adr --^ --.^ _ ^y^O 
1105 

X.n Ci« Mn V.I Ser.Oln Scr L.u cy. cjn^clu A.P X.p Tyr CXu^A.p 



1125 

..p Ly. Pre T,. A.n Tyr Ser Ciu Ar, Tyr ser Clu Clu Clu^Cln His 
1140 

CI„ Olu Ciu Clu XT, pro Thr A.« Tyr s.r tU Ly. Tyr^A.n Clu Cl« 

1155 

Hi. val A.P cin pro n. *.P Tyr Ser L.«^ty. Tyr AU Thr 
1X70 ^^^^ 
..p XI. pro s.r *.r 0X« Ly. cin s.r Ph. S.r^Ph. Ser Ly. S.r S.r^ 

1185 ^^^^ 

ser Cly cm »er S.r Ly. Thr Clu Hi. H.t S.r S.r S.r S.r clv^A.n 

120S 

Ala Ly« Arg Cin Asn Cin lm\x Hi» Pro 



1220 
C 

1235 



. »i. ri« s*r Arc Str Gly Cin Pro Cin Ly« Al* Al* Ihr Cy. 
S«r $*r ^lf,<5ln 5«r Arg :»»r wxj^ ^^^^ 



I,y. V.1 S.r S.r n. A.n Cl« Clu Thr II. Cin Thr^Tyr Cy. v*l clu 

i-i^A 12bb 



1250 

X.p^Thr Pre He cy. Ph.^Ser Ar, Cy. S.r «« s.r s.r L.« Ser^ 

ser AU Clu A.P Clu II. Cly Cy. A.n Cin Thr Thr Cin Clu Al.^A.p 

1285 . 

S.r AU A,n Thr Lou Cin 11. Ala Clu II. iy. Clu Ly. lle^Cly Thr 

1300 

Xr« S.r AU Clu A.p Pro V.l Ser Clu V,: Pre AU V*l S.r Cin R.. 



1315 

pro AT, Thr Ly. S.r S.r Ar.| L.u Gin Cly S.r S.r^Leu S.r S.r Ciu 

1330 ^^^^ 

S.r AU AT, Hi. Ly. AU V.l Clu Ph. S.r S.r Cly AU ty. S« Pre^ 



1345 



1350 



ser Ly. Ser Cly aU cin Thr Pro Ly. Ser^Prc Pro Clu Hi. Tyr^v.l 

1365 

cm Clu Thr Prc^Leu Met Ph. S.r Ar^^Cy. Thr Ser V.l S.r^S.r L.u 

.,p ser Ph. Clu ser Ar. Ser lU AU Ser s.r v.l Cln^S.r Clu Pro 
1395 

cy, ser Cly H.t V.l Ser Cly lU lU ser .rc Ser^A.p Leu Pro A.p 

1410 ^^^^ 
Ser Pre CXy CXn Tnr Pro S.r Xr, S.r^.y Pro Pro Pro^ 
1425 

Thr Lv« Xro Clu VAl ?ro tyt A«n Ly« Al* 

Pro Pro cm Thr XI. Cln Thr Ly« «9 w« ^^^^ 

1445 

* Lv« clfl Al* Al« Val Abti 

Pro Thr Al. CXu Ly. Arg Clu S« Cly Pro Ly. CXn Al ^^^^ 

i460 

XI. Al. V.1 cm Ar, v.: Cln V.1 L.U pro A.p Al. A.p Jhr Leu Leu 

1475 

Hi. Pn. Al. Thr Clu S.r Thr pre A.p Cly Ph. S.r^Cy. s.r S.r s.r 

1490 

Leu S.r Al. L-u S.r L.« A.p Clu Pro Ph. II. Cln Ly. A.p V.l Clu^ 



150S 1510 
L.U Arg II. M.t Pro^Pro V.l Cln Clu A.n^A.p A.n Cly A.n Clu Jhr 

Cl« S.r Clu Cln pro Ly. «« S.r A.n Clu A.n Cln Clu Ly. Clu Al. 

1S40 1545 

Clu Ly. Thrill. A.p S.r OU Ly.^A.p L.u Uu A.p X.p^S.r A.p A.p 
A.p A.p II. Clu 11. L.U CI- Clu Cy. II. U. S.r^Al. M.t Pro Thr 

1570 1575 

Ly. s.r S.r Arg Ly. Al. Ly. Ly. Pro Al. Cln Thr Al. S.r Ly. L.«^ 

1585 i59w 

Pro Pro pro v.l Al.^Arg Ly. Pro Ser Cln^Leu Pro V.l Tyr Ly.^L.u 

Leu Pro ser Cln^A.n Arg Leu Cln Pro^Cln Ly. Hi. V.l S.r^Ph. Thr 



1620 



1635 



Pro Cly A.P/.P M.t pro Arg v.l^Tyr Cy. v.l Clu Cly^Tnr Pro II. 

Thr a 
Ala G 

16S5 1670 



A.n Pho S« Thr AU Thr 5.r Leu Ser A.p tc. Thr 11. Clu sor Pro 

1650 1^55 



Pro A«n Clu Leu AU AU Cly Clu Cly Val Xr, cly cly Al. Cln S.r^ 
16S5 16"0 

Cly Clu Phe X.P Thr He Pro^xnr Clu Cly Arg s.r^Thr 

X.P Clu Al. Cln Cly Cly Ly. Thr S.r S.r V.l Thr II. Pro Ciu L.u 

1700 1'°* 

A.P A.p '^i- ;^^o*'^ 

ser Al. Met Pro Ly. Cly Ly. Ser Hi. Ly. Pro Ph.^Arg v.l Ly. Ly. 

1730 1'^* 

.f ^ z-^* rift xla Scr Xl« Ser S#r Ser Ala Pro Aan 
lU^Met Asp Cln V.l ^'"^ *^ 1755 1760 
ty. *.n cm t.u MP ciy Ly- ^r- Jr;/« I^is^"' 

^ 1765 

^Kr clu Tvr ^rg Thr ^rg Vii Arg Ly« A»n XXa 
pro Il» ?ro Cln A»n Thr Glu Tyr ^9 ^ _ ^^^^ 

I7fl0 ^ 

^-f. iiA Clu Xxg Vtl Fh« Ser A»P 
XBp ser Lyi^Xin Atn L«u Aan AU^Clu Axg ^^^^ 

T t rin A.n L«u Lyi A.n k$n S.r ty. Atp Ph* Xen A«? 
Atp fi«r Lys ty» Cln A«n ^« ^-J^' igjO 
1810 1815 

X.y. X.U pro Mn *.n CI- A.p Xr, V.X ^9 |Xy^»- 
1825 1^^^ 

. •u- ii« Glu Cly Thr Pro Tyr Cy« Pht Ser 

ser Pre HU Hi« Tyr Thr Pro lie oiu ^xj^ ^^^^ 

1845 

XT, A.n A.P s.r^L.u 5er 5er L.« X.p^PH. *.p A.p X.p A.p^V.X A.p 

S« Ar, cL ty. Xia CX« t.u A., .y. Al. ty. CXu^A.n Ly. Cl« 

1875 1^®^ 
ser Clu AU ty. V.l Thr Ser Hi. Thr «1« te« Thr^S.r A.n Cln Cln 

1890 1^'^ 
S« AU A.n ty. Thr CXn AU II. XU ty. Cln^Pro lU A.n Ar, Cly^ 

1905 1'^° 

Cln pro ty. Pre II. t.« Cln ty. cln S.r Thr Ph. Pro Cln Ser^Ser 

1925 

ty. A.p n. Pro^A.p Arg Cly AU AU^Thr X.p Clu Ly. L-u^Cln A.n 

Ph. AU lU CU A.n Thr pre V.l Cy. Phe S.r Hi. A.n^Ser Ser L.« 

1955 1'^^ . 

S.r ser teu Ser A.p lU A.p Cln Clu A.n A.n A.n Ly. Clu A.n Clu 



1970 



1975 



pro ne ty. Clu Thr Clu^Pro Pro A.p S.r Cln^Cly Clu Pro S.r ty.^ 



198S 



1990 



P„ cm AU S.r cly^Tyr XU Pro Ly. S« Ph. Hi. v.l Clu A.p Jhr 



2D0S 



Pro V.1 Cy. Ph. S.r Ar, A.n S.r Ser I .u Ser S.r t.« S« lU A.p 

2020 ^^^^ 

ser Clu A.P A.P Leu L.u Cln Clu Cy. II. Ser S.r XU^H.t Pro ty. 

2035 '^^^^ 

.y. ty. ty. pro Ser Ar, teu Ly. Cly X.p A.n Clu^ty.. Hi. «.r Pro 



20S0 2055 



Xr,^X.n Met Cly Cly lU^Leu Cly Clu A.p L.u^Thr teu A.p L.u ty.^ 
r.rixe cm Ar9 Prc^X.p ser Cl-o hU cly Uu s.r Pro ..c S.r^Cl. 
A.n Ph. A.p Trp Ly. XU !!• Cln Clji Cly AU A.n ST II. V.l S.r 

2100 2105 

S.r L.« HI. Cln XI. XL KU XU XU Cy. t.« S.r Xr| Cln XU Ser 

2115 2120 - Zizs 

ser X,p S.r X.p S.r 11. L.« 6.r !..« Ly. S.r Cly II. Ser L.u Cly 

2130 2135 2140 

5.C Pre Ph. Hi. L.U Thr Pro x.p Cln Clu Clu Ly. Pro Ph. Thr Ser 
2145 2150 2155 21'0 

A.n Ly. Gly Pro Xrfl He t.u ty- Pro Cly Clu Ly. S.r Thr L.u Clu 

' 2165 2170 2175 

Thr Ly. Ly. lle^Clu S.r Clu S.r Ly.^Cly II. Ly. Cly Cly^Ly. Ly. 

VAl Tyr Ly. S.r L.u lie Thr Cly Ly. Vtl Ar9 s.r A.n s.r Clu He 
2195 2200 2205 

s*r Cly Cln Het Ly. Cln Pro L.« Cln Ala A.n Mjt Pro Ser He Ser 

2210 2215 2220 

Arg Cly Ar? Thr M.t II. «!• He Pro cly V*l ^rg A.n S.r S« 9.r 

2225 2230 223S 2240 

S.r Thr Ser Pro V.l Ser Ly. Ly. Cly Pro Pro Leu Ly. Thr Pro/1* 

2245 2250 225> 

s«r Ly. ser Pro S«r Clu Cly Cln Thr Al* Thr Thr Ser Pro Arg Cly 
22S0 2265 2270 

Al* Ly. Pro ser v.l Ly. Ser Glu L.u Ser Pro V.l Al. Arg Cln Thr 
2275 2280 22BS 

Ser Cln lie Cly Cly Ser Ser Ly. Al* Pro Ser Arg Ser Cly Ser Arg 

2290* 2295 2300 

A.p ser Thr Pro Ser Arg Pro Al* Cln Gin Pro Leu Ser Arg Pre II. 
2305 2310 2315 2320 

Cln S.r pro Cly Arg A.n Ser He Ser Pro Cly Arg A.n Cly II. S.r 
2325 2330 2335 

Pro Pro A.n Ly. L«u ser Cln Leu Pro Arg Thr Ser Ser Pro S.r Thr 
2340 2345 2350 

Al* S.r Thr Ly. Ser Ser Cly S.r Ciy Ly. Met scr Tyr Thr S.r Pro 
2355 2360 2365 

Glv Aro Cln M.t Scr Gin Cln A.n Leu Thr Ly. Cln Thr Cly Leu Ser 
^ 2370 2375 2380 

Ly. A.n Al* S.r Ser He Pro Arg Ser Clu Au S.r Ly. Cly L.u 
2385 2390 2395 2400 

A.n cm Met A.n A.n Cly A.n Cly Al* A.n Ly. Ly. V.l Clu Leu Ser 

240S 2410 2415 

Aro Met ser Ser Thr Lye Ser Ser Cly S.r clu ser A.? Arg Ser clu 
' 2420 2425 243w 
^, ,„ v.j^^. ... - !J3s"" 

,„ «c »' \T,r "S""' "° 

24S0 , 

« s.. «, - "* 

2465 2470 

^ - Klo"' 

„. ..1 =u «.^cw CI, w. - 

^ „. r^r 5« "* 

2515 

^ .„ s» =1- « r~ «' U«"" 

2S30 2W 

«, ixp L,. »» »" »' lUs"' 55« 

2545 

S.r Thr Txp AT, Arg Thr cly Ser S« S.r^S.r n. ..u Ser AU^»er 

2365 



XI* Lvi s«r clu A»p Clu Lys Hi* VaI A«n 
Ser Clu S«r S«r Clu Ly» XI* I-y« *- 2590 

2580 ^^^^ 

rv. Cln S«r Lyt Clu A.n Gin V*l Ser Al* Ly» 
Ser lie S«r cly Thr Ly» cin ^'^ ^^qs 
2595 

• * ^ TV. il€ Lvs Clu A«n Clu ?hc ser Pro Thr A.n st 
Cly T!ir Ttp Krg Ly» Il« * ^^20 

2610 

... .in T« V.1 |« s.r 01, ... ,« «• «' JHo 

2625 

,nr ^« XU Oln^K,. XX. Pro .1. V.l^S.r ... T« .lu X.p^V.I 
Xn> V.1 Ue^alu A.P cy. Pro Ile^Aen X.n Pro Xrg Ser^Cly Ar, 
se. pro T.r.r-n Thr Pro Prc^v.I U. Xap S.r v.l^S.r CXu .y. 



26-75 



. Lv. A.D S«r Lyt Atp Afln Cln Ala Ly« Cln A«n 

a A«n Pro Ain Il« l-y* ^'f ^^'^ ^ 2700 



Al« • ^£bc 

2690 2695 



, « w«* Th' v*'' Clv L»u Clu A«n Arg 

val Cly Jun Cly Ser v*l Pro Me. Arg Th. v..^Cl. ^^^^ 



A.„ ser Phe nebcin V.l Aep AU Prc^x.p Cln Ly. Oly Thr^Clu 

Xle Ly. pro Cly Cl« ^" I^fo"" 

2740 ^^^^ 

ST ser lie V.1 Clu Ar, Thr Pro Phe S.r Ser ser Ser^Ser Ser Ly. 

27S5 ^'^ 
Hi. ST ST Vro ST «ly Thr^V*i M. XX. Arg VU^Tnr Pro Ph. A.n 



2770 



tyr A.n Pro 5er Pro xr, Ly. S.r St M. x.p St Thr S« XU Xr,^ 

pro 5« cl« n- Pro Thr Pro V.l X.n X.n x.n Thr Ly. Ly. Xr, x.p 

2g05 2810 ^o^a 

ST ty. Thr X.p ST Thr Cl« St S.r Cly Thr Cln S.r Pre Ly. Xr, 
2820 2825 ^ojv 

Hl« S«r Cly s«r Tyr L€u Vtl Thr S«r Val 
28^5 2840 

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vii) IMMEDIATE SOURCE:

(B) CLONE: ral2(yeast)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Leu Thr Gly Ala Lys Gly Leu Gln Leu Arg Ala Leu Arg Arg Ile Ala
1 5 10 15

Arg Ile Glu Gln Gly Gly Thr Ala Ile Ser Pro Thr Ser Pro Leu
20 25 30

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:

(B) CLONE: m3(mAChR)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Leu Tyr Trp Arg Ile Tyr Lys Glu Thr Glu Lys Arg Thr Lys Glu Leu
1 5 10 15
Ser Gly Thr Glu Ala Glu Thr Glu

30 

(2) INFORMATION FOR SEQ ID NO: 10:

(A) LENGTH: 29 amino acids

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

I..U Al* Clu Clu Axg Ser Arfl Trp «« ty. clu leu 
l.eu Tyr Pro *•«> *** lo 



1 5 
M. «y l-u xr, clu clu A.n Clu i-u Thr Al. H«c 

20 



(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
CTATCAAGAC TCTOAC-TTT AATXCXAGTt TATCCATTTI 
(2) INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
TTTXCXATTT CXTCTTAXTA TATTOTCTTC TTTrrXACXC 
(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
CTACATTTTA AAAACCTCTT TTAAAATAAT TTTTXAACCT 
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
AACCAATTCI TCTATAAAAA CTTCTTTCTA TTTTATTTAC 
(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
CTAACTTTTC TTCATATAGT AAACATTCCC TTCTCTACTC 
(2) INFORMATION FOR SEQ ID NO: 16:

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
^HHHK. KHHCTCCCn TTTTTAAAAA AAAAXAATAC 
(2) INFORMATION FOR SEQ ID NO: 17:

(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
CTAACTAACT T<«CA«ACA ACITATTTGA AACTTTAATA 
(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
ATACAAOATA TTCATACTTT TTTATTATTT CTCCTTTTAC 
(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
CTXACTTACT TCTTTCTAAC TCXTXAAXCA CrCAACACCT 
(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
AA7AAAAACA TAACTAATTA GGTTTCTTCT TTTATTTTAC 
(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
CTTACTAAAT TSCCTTTTTT GirTCTCGGT ATAAAAATAC 
(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

XCCATTTTTC CATCTACTCA TCTTAACTCC XTCTTAACXC 

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

CTAAATAAAT TATTTTATCA TATTTTrTAA AATTAITTAA . 

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 64 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
CATGATCrrA TCTCTATTTA CCTATACTCX AAATTATACC ATCTATAATG TCCTTAATTT 

TTAG 

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 52 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
CTAACAGAAG ATTACAAACC CTGCTCACTA ATGCCATCAC TACTTTCCTA AG 52 



60 
64 
(2) INFORMATION FOR SEQ ID NO: 26:

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
CCXXATTiUU. CT»T*XTTT.TCTTXCX*X* CXCXTTTOCC CCXCXC 
(2) INFORMATION FOR SEQ ID NO: 27:
(i) SEQUENCE CHARACTERISTICS:

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

6TXTCTTCIC TXWCT6TAC XTCCTACTCC XTGITTCXAA 

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 56 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
CATCATTCCT CTTCAAATAA CAAACCATTA T«TTTAT«T TCATTTTATT TTTCAC 
(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 43 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
CTAACACXAA AATC7TTTT7 KATGACXTAC ACAATTXCTC CTC 
(2) INFORMATION FOR SEQ ID NO: 30:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

TTACXTQATT CTCTTTTTCC TCTTCCCCTT TTTAAXTTAC 

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 44 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

CTATCTTTTT ATAACATCTA TTTCTTAACA TACCICACCT ATCA 

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 54 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
CCrrCCCTTC IU«:TTCHCTT TTt*AtCXtC CTCTXTTCTC rXTTTXATTT .CA« 
(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:

(B) TYPE: nucleic acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
CXACTATrTA OAATTTCACC TCTTTTTCTT TTTTCTCTTT TTCTTTaACO CACOOTCTCA 60 

65 

CTCTC 

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 52 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
GCAACTACTA TGATTTTATG TATAAATTXA TCTAXAATTG ATTXATTTCC AC 52
(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

ctacctttga aaacatttag tactataxt^ tcaatttcat ct <2 



(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
CCAACTCNAA TTACATCACC CATATTCACA XACTTACTAC 
(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 54 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
OXATATATAG ACTTTTAXAT TACTTTTAAA CTACACAATT CATACTCTCA AAAA 
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
ATTCTGACCT TAATTTTCTC ATCTCriCAT TTTTATTTCA C 

(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
TCCCCGCCTC CCCCTCTC 
(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
CCACCCCCCC CTCCCCTC 18
(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
CTGAACCCCT CTCATGCTGC 20
(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:
ACCTCCCCCC ACCXATCCX 
(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:
ATCATATCrr ACCAAATCAT ATAC 
(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

TTATTCCTAC TTCTTCTATA CAC 

(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

TACCCATCCT GCCTCTTm C

(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
TCCCCCCXTC TTCTTCCTCA 
(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
ACATTACCCA CAAACCTTCC AA 
(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:
ATCAAGCTCC ACTAACAAGC TA 22 
(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:
TCCGCCTCCT CCCTTCTTC 
(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:
CCCCCTTCCr TTCTCACCAC 
(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

TTTTCTCCTG CCTCTTACIC C 

(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

ATGACACCCC CCATTCCCTC 

(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:
CCACTTAAAO CACATATATT TACT 
(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

CTATCCAAAA TACTCAACAA CC 

(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:
TTCTTAACTC CTCTTTTTCT TTTG 24

(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

TTTACAACCT TTTTTCTCTT CTC 

(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

CTCACATTAT ACACTAACCC TXAC 

(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:
CATCTCTCTT ACAGTAGTAC CA 
(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

ACC7CCAACC GTA6CCXACG 

(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

TAAAAATGGA TAXACTACAA TTAAAAC 

(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

AAATACACAA TCATGTCTTC AACT 

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:
ACXCCTXXAC ATCACXATTT CAC 
(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:
TAACTTACAT ACCACTAATT TCCC 
(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
ACAATAAACT CGACTACACA ACC 
(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:
ATACCTCATT CCTTCTTGCT CAT

(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:
TCAATTTTAA TCCATTACCT ACCT 
(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:
CTTTTTTTCC TTTTACTCAT TAACC 
(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:
TCTAATTCAT TTTATTCCTA ATAGCTC 
(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:
CCTACCCXTA CTATCATTAT TTCT 
(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:
CXACCTATTT TTATACCCAC AAAC 

(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:
AACAAACCCT ACACCATTTT TCC 
(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:
CKTCATTCTT ACXACCXTCT TCC 
(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:
ACCTATACTC TAAATTATAC CATC 
(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:
CTCATCGCAT TACTCACCAC 
(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:
ACTCCTAATT TTCTTTCTAA ACTC 



(2) INFORMATION FOR SEQ ID NO: 76:

(B) TYPE: nucleic acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:
TCAACCACTC CCATTTCACC C 
(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:
TCATTCACTC ACACCCTCAT GAC 
(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:
GCTTTGAAAC ATCCACTACC AT 
(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

AAACATCXTT CCTCTTCAXA TXAC 

(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:
TACCATCATT TAAAAATCCA CCAC 
(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:
GATGATTGTC TTTTTCCTCT TCC
(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:
CTCACCTATC TTAACXAATA CATC 
(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

TTTTAAATCA TCCTCIATXC TCTAT 

(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:
ACAGACTCAG ACCCT6CCTC AAAC 
(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

rrrcTATTcr tactcctacc att

(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:
ATACACAGGT AACAAATTAC CA 
(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:
TAOATCACCC ATATTCTGTT TC 
(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:
CAATTAGCTC TTTTTGACAC TX 
(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:
CTTACTGCXT ACACATTCTC AC 
(2) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:
OCTTTTTCTT TCCTAACATG AAG

(2) INFORMATION FOR SEQ ID NO: 91:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:
TCTCCCACAC GTAATACTCC C 
(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:

CCTACAACTG XATCCCCTAC C 21

(2) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:
CXCCACAAAA TAATCCTOTC CC 
(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:
AT I IT C I I AC TTTCATTCTT CCTC 
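
Each entry in the listing above declares a LENGTH and then gives the residues in space-separated groups of ten, so a transcription of the listing can be sanity-checked by counting letters. The following is a minimal sketch of such a check and is not part of the specification. The function names `residue_count` and `length_matches` are hypothetical, and the three sample entries are copied from SEQ ID NOS: 48, 71 and 91 as they appear above. The check only counts letters, so it cannot tell whether any individual base was transcribed correctly.

```python
def residue_count(sequence_line: str) -> int:
    """Count residue letters, ignoring spaces and any trailing position number."""
    return sum(1 for ch in sequence_line if ch.isalpha())


def length_matches(declared_length: int, sequence_line: str) -> bool:
    """Return True when the declared LENGTH equals the number of residues."""
    return residue_count(sequence_line) == declared_length


# (declared LENGTH, sequence line) keyed by SEQ ID NO, copied from the listing.
entries = {
    48: (22, "ATCAAGCTCC ACTAACAAGC TA 22"),
    71: (23, "AACAAACCCT ACACCATTTT TCC"),
    91: (21, "TCTCCCACAC GTAATACTCC C"),
}

for seq_id, (declared, line) in entries.items():
    status = "ok" if length_matches(declared, line) else "MISMATCH"
    print(f"SEQ ID NO: {seq_id}: declared {declared}, counted {residue_count(line)} -> {status}")
```

A mismatch reported by such a check would point to a dropped or duplicated character in the transcription rather than to an error in the sequence itself.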



CLAIMS

1. A method of diagnosing or prognosing a neoplastic tissue
of a human, comprising:

detecting somatic alteration of wild-type APC gene coding
sequences or their expression products in a tumor tissue isolated
from a human, said alteration indicating neoplasia of the tissue.

2. The method of claim 1 wherein the expression products 
are mRNA molecules. 

3. The method of claim 2 wherein the alteration of
wild-type APC mRNA is detected by hybridization of mRNA from said
tissue to an APC gene probe.

4. The method of claim 1 wherein alteration of wild-type 
APC gene coding sequences is detected by observing shifts in 
electrophoretic mobility of single-stranded DNA on non-denaturing 
polyacrylamide gels. 

5. The method of claim 1 wherein alteration of wild-type
APC gene coding sequences is detected by hybridization of an APC
gene coding sequence probe to genomic DNA isolated from said tissue.

6. The method of claim 5 further comprising:

subjecting genomic DNA isolated from a non-neoplastic
tissue of the human to Southern hybridization with the APC gene
coding sequence probe; and

comparing the hybridizations of the APC gene probe to
said tumor and non-neoplastic tissues.

7. The method of claim 5 wherein the APC gene probe 
detects a restriction fragment length polymorphism. 

8. The method of claim 1 wherein the alteration of
wild-type APC gene coding sequences is detected by determining the
sequence of all or part of an APC gene in said tissue using a polymerase
chain reaction, deviations in the APC sequence determined from that
of the sequence shown in Figure 7 (SEQ ID NO.: 1) suggesting neoplasia.

9. The method of claim 1 wherein the alteration of wild-type
APC gene coding sequences is detected by identifying a mismatch
between molecules (1) an APC gene or APC mRNA isolated from said
tissue and (2) a nucleic acid probe complementary to the human
wild-type APC gene coding sequence, when molecules (1) and (2) are
hybridized to each other to form a duplex.

10. The method of claim 5 wherein the APC gene probe
hybridizes to an exon selected from the group consisting of: (1)
nucleotides 822 to 930; (2) nucleotides 931 to 1309; (3) nucleotides
1406 to 1545; and (4) nucleotides 1956 to 2256.

11. The method of claim 1 wherein the alteration of wild- 
type APC gene coding sequences is detected by amplification of APC 
gene sequences in said tissue and hybridization of the amplified APC 
sequences to nucleic acid probes which comprise APC sequences. 

12. The method of claim 1 wherein the alteration of
wild-type APC gene coding sequences is detected by molecular cloning
of the APC genes in said tissue and sequencing all or part of the cloned
APC gene.

13. The method of claim 1 wherein the detection of alteration
of wild-type APC gene coding sequences comprises screening for
a deletion mutation.

14. The method of claim 1 wherein the detection of alter- 
ation of wild-type APC gene coding sequences comprises screening for 
a point mutation. 

15. The method of claim 1 wherein the detection of alteration
of wild-type APC gene coding sequences comprises screening for
an insertion mutation.

16. The method of claim 1 wherein the tumor tissue is a 
colorectal tissue. 

17. The method of claim 6 wherein the non-neoplastic tissue 
isolated from a human is from colonic mucosa. 

18. The method of claim 1 wherein the expression products
are protein molecules.

19. The method of claim 18 wherein the alteration of
wild-type APC protein is detected by immunoblotting.

20. The method of claim 18 wherein the alteration of
wild-type APC protein is detected by immunocytochemistry.

21. The method of claim 18 wherein the alteration of
wild-type APC protein is detected by assaying for binding interactions
between APC protein of said tumor tissue and a second cellular protein.

22. The method of claim 21 wherein the second cellular protein
is selected from the group consisting of MCC protein, wild-type
APC protein, and a G protein.

23. The method of claim 18 wherein the alteration of 
wild-type APC protein is detected by assaying for phospholipid 
metabolites. 

24. A method of supplying wild-type APC gene function to a
cell which has lost said function by virtue of a mutation in an APC
gene, comprising:

introducing a wild-type APC gene into a cell which has
lost said gene function such that said wild-type APC gene is expressed
in the cell.

25. The method of claim 24 wherein the wild-type APC gene
introduced recombines with the endogenous mutant APC gene present
in the cell by a double recombination event to correct the APC gene
mutation.

26. A method of supplying wild-type APC gene function to a
cell which has altered APC function by virtue of a mutation in an APC
gene, comprising:

introducing a portion of a wild-type APC gene into a cell
which has lost said gene function such that said portion is expressed in
the cell, said portion encoding a part of the APC protein which is
required for non-neoplastic growth of said cell.

27. A method of supplying wild-type APC gene function to a 
cell which has altered APC function by virtue of a mutation in an APC 
gene, comprising: 

applying human wild-type APC protein to a cell which has 
lost wild-type APC function. 

28. A method of supplying wild-type APC gene function to a
cell which has altered APC gene function by virtue of a mutation in an
APC gene, comprising:

introducing into the cell a molecule which mimics the
function of wild-type APC protein.

29. A pair of single stranded DNA primers for determination
of a nucleotide sequence of an APC gene by polymerase chain reaction,
the sequence of said primers being derived from chromosome 5q band
21, wherein the use of said primers in a polymerase chain reaction
results in synthesis of DNA having all or part of the sequence shown in
Figure 7.

30. The primers of claim 29 which have restriction enzyme
sites at each 5' end.

31. The pair of primers of claim 29 having sequences corre- 
sponding to APC introns. 

32. A nucleic acid probe complementary to human wild-type 
APC gene coding sequences. 

33. The nucleic acid probe of claim 31 which hybridizes to an
exon selected from the group consisting of: (1) nucleotides 822 to 930;
(2) nucleotides 931 to 1309; (3) nucleotides 1406 to 1545; and (4)
nucleotides 1956 to 2256.

34. A kit for detecting alteration of wild-type APC genes
comprising a battery of nucleic acid probes which in the aggregate
hybridize to all nucleotides of the APC gene coding sequences.

35. A method of detecting the presence of a neoplastic tissue
in a human, comprising:

detecting in a body sample isolated from a human alteration
of a wild-type APC gene coding sequence or wild-type APC
expression product, said alteration indicating the presence of a
neoplastic tissue in the human.

36. The method of claim 35 wherein said body sample is 
selected from the group consisting of serum, stool, urine and sputum. 

37. A method of detecting genetic predisposition to cancer,
including familial adenomatous polyposis (FAP) and Gardner's Syndrome
(GS), in a human comprising:

detecting a germline alteration of wild-type APC gene
coding sequences or their expression products in a human sample
selected from the group consisting of blood and fetal tissue, said
alteration indicating predisposition to cancer.

38. The method of claim 37 wherein the expression products 
are mRNA molecules. 

39. The method of claim 38 wherein the alteration of
wild-type APC mRNA is detected by hybridization of mRNA from said
tissue to an APC gene probe.

40. The method of claim 37 wherein alteration of wild-type
APC gene coding sequences is detected by observing shifts in
electrophoretic mobility of single-stranded DNA on non-denaturing
polyacrylamide gels.

41. The method of claim 37 wherein alteration of wild-type 
APC gene coding sequences is detected by hybridization of an APC 
gene coding sequence probe to genomic DNA isolated from said tissue. 

42. The method of claim 41 wherein the APC gene coding 
sequence probe detects a restriction fragment length polymorphism. 

43. The method of claim 37 wherein the alteration of
wild-type APC gene coding sequences is detected by determining the
sequence of all or part of an APC gene in said tissue using a polymerase
chain reaction, deviations in the APC sequence determined from the
sequence of Figure 7 suggesting predisposition to cancer.

44. The method of claim 37 wherein the alteration of wild-type
APC gene coding sequences is detected by identifying a mismatch
between molecules (1) an APC gene or APC mRNA isolated from said
tissue and (2) a nucleic acid probe complementary to the human
wild-type APC gene coding sequence, when molecules (1) and (2) are
hybridized to each other to form a duplex.

45. The method of claim 41 wherein the APC gene probe 
hybridizes to an exon selected from the group consisting of: 
(1) nucleotides 822 to 930; (2) nucleotides 931 to 1305; (3) 
nucleotides 1406 to 1545; and (4) nucleotides 1356 to 2256. 

46. The method of claim 37 wherein the alteration of wild- 
type APC gene coding sequences is detected by amplification of APC 
gene sequences in said tissue and hybridization of the amplified APC 
sequences to nucleic acid probes which comprise APC gene coding 
sequences. 

47. The method of claim 37 wherein the alteration of 
wild-type APC gene coding sequences is detected by molecular cloning 
of the APC genes in said tissue and sequencing all or part of the cloned 
APC gene. 

48. The method of claim 37 wherein the detection of alter- 
ation of wild-type APC gene coding sequences comprises screening for 
a deletion mutation. 

49. The method of claim 37 wherein the detection of alter- 
ation of wild-type APC gene coding sequences comprises screening for 
a point mutation. 

50. The method of claim 37 wherein the detection of alter- 
ation of wild-type APC gene coding sequences comprises screening for 
an insertion mutation. 

51. The method of claim 37 wherein the expression products 
are protein molecules. 

52. The method of claim 51 wherein the alteration of 
wild-type APC protein is detected by immunoblotting. 

53. The method of claim 51 wherein the alteration of 
wild-type APC protein is detected by immunocytochemistry. 

54. The method of claim 51 wherein the alteration of 
wild-type APC protein is detected by assaying for binding interactions 
between APC protein isolated from said tissue and a second cellular 
protein. 

55. The method of claim 54 wherein the second cellular pro- 
tein is selected from the group consisting of MCC protein, wild-type 
APC protein and a G protein. 

56. A method of screening for genetic predisposition to can- 
cer, including familial adenomatous polyposis (FAP) and Gardner's Syn- 
drome (GS), in a human comprising: 

detecting among kindred persons the presence of a DNA 
polymorphism which is linked to a mutant APC allele in an individual 
having a genetic predisposition to cancer, said kindred being 
genetically related to the individual, the presence of said polymorphism 
suggesting a predisposition to cancer. 

57. A preparation of the human APC protein substantially 
free of other human proteins, the amino acid sequence of said protein 
corresponding to that shown in Figure 3 or 7 (SEQ ID NO: 1). 

58. A preparation of antibodies immunoreactive with a 
human APC protein and not substantially immunoreactive with other 
human proteins. 

59. A method of testing therapeutic agents for the ability to 
suppress a neoplastically transformed phenotype, comprising: 

applying a test substance to a cultured epithelial cell 
which carries a mutation in an APC allele; 

determining whether said test substance suppresses the 
neoplastically transformed phenotype of the cell. 

60. The method of claim 59 wherein the cultured epithelial 
cell has been genetically engineered to carry the mutation in the APC 
allele. 

61. A method of testing therapeutic agents for the ability to 
suppress neoplastic growth, comprising: 

administering a test substance to an animal which carries 
a mutant APC allele in its genome; 

determining whether said test substance prevents or sup- 
presses the growth of tumors. 

62. A transgenic animal which carries a mutant APC allele 
from a second animal species in its genome. 

63. An animal which has been genetically engineered to con- 
tain an insertion mutation which disrupts an APC allele in its genome. 

64. A cDNA molecule which encodes a protein having the 
amino acid sequence shown in Figure 3 or 7 (SEQ ID NO: 7 or 1). 

65. An isolated DNA molecule which encodes a protein having 
the amino acid sequence shown in Figure 3 or 7 (SEQ ID NO: 7 or 1). 

66. A yeast artificial chromosome which is known as 37HG4. 

TABLE IIA 

Germline mutations of the APC gene in FAP and GS Patients 

                     NUCLEOTIDE        AMINO ACID             EXTRA-COLONIC 
PATIENT     CODON    CHANGE*           CHANGE         AGE     DISEASE 

91          279      TCA->TGA          Ser->Stop      39      Mandibular Osteoma 
2[?]        301      CGA->TGA          Arg->Stop      46      None 
2[?]        301      CGA->TGA          Arg->Stop      27      Desmoid Tumor 
21          413      CCC->TCC          [illegible]    24      Mandibular Osteoma 
60          712      TCA->TGA          Ser->Stop      31      Mandibular Osteoma 

37[??]      343      CACAC->CAC        splice-junction 
[illegible] 301      CGA->TGA          Arg->Stop 
3827        435      CTTTCA->CTTCA     frameshift 
3712        500      T->G              Tyr->Stop 

* The mutated nucleotides are underlined. 

TABLE IIB 



Somatic Mutations in Sporadic CRC Patients 

TUMOR     GENE    CODON        NUCLEOTIDE CHANGE         AMINO ACID CHANGE 

T[?]      MCC     12           [illegible]               (Splice Donor) 
T16       MCC     [illegible]  [illegible]               (Splice Acceptor) 
T47       MCC     267          CGG->CTG                  Arg->Leu 
T11       MCC     490          TCG->TTG                  Ser->Leu 
T35       MCC     50[?]        CGG->CAG                  Arg->Gln 
T9[?]     MCC     691          GCT->GTT                  Ala->Val 
T[??]     APC     2[??]        CCAGT->CCCAGCCAGT         (Insertion) 
T27       APC     33[?]        CGA->TGA                  Arg->Stop 
T1[?]3    APC     43[?]        [illegible]               (Splice Donor) 
T20[?]    APC     [?]33[?]     CAG->TAG                  Gln->Stop 

For splice site mutations, the codon nearest to the mutation is listed. 

The underlined nucleotides are mutant; small case letters represent introns, large case letters represent exons. 
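
The single-codon entries in the mutation tables pair a nucleotide change with the resulting amino acid change (for example CGA to TGA giving Arg to Stop, or CAG to TAG giving Gln to Stop). The following is a minimal illustrative sketch, not part of the disclosed methods, of how such a pairing can be checked against the standard genetic code. The function name `classify_codon_change` and the effect labels are assumptions introduced here for illustration only; frameshift, insertion and splice-site entries are outside its scope.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(wild_type, mutant):
    """Return (wild-type residue, mutant residue, effect) for a codon swap.

    Hypothetical helper: both arguments are three-letter DNA codons.
    """
    wt_aa = CODON_TABLE[wild_type.upper()]
    mut_aa = CODON_TABLE[mutant.upper()]
    if wt_aa == mut_aa:
        effect = "silent"
    elif mut_aa == "*":
        effect = "nonsense"  # codon now encodes a stop
    else:
        effect = "missense"  # codon now encodes a different residue
    return wt_aa, mut_aa, effect


if __name__ == "__main__":
    for wt, mut in [("CGA", "TGA"), ("CAG", "TAG"), ("CGG", "CTG")]:
        print(wt, "->", mut, classify_codon_change(wt, mut))
```

Residues are reported in one-letter code, so an Arg to Stop change appears as `("R", "*", "nonsense")`.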
TABLE III 

Sequences of Primers Used for SSCP Analysis 

[Primer sequences illegible in the source] 

TABLE IV 



Seven Different Versions of the 20-Amino Acid Repeat 

1282:         YCVEDTPICFSRCSSLSSLS 
1378:         HYVQETPLMFSRCTSVSSLD 
1492:         FATESTPDGFSCSSSLSALS 
[illegible]:  YCVEGTPINFSTATSLSDLT 
[illegible]:  TPIEGTPYCFSRNDSLSSLD 
19[?]3:       FAIENTPVCFSHNSSLSSLS 
20[?]3:       FHVEDTPVCFSRNSSLSSLS 

Numbers denote the first amino acid of each repeat. The consensus 
sequence at the top reflects a majority amino acid at a given position. 
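
The note to Table IV describes a consensus built from the majority amino acid at each position of the aligned repeats. The following is a minimal illustrative sketch of that idea, not the procedure actually used to prepare the table. It assumes equal-length, already aligned sequences and a strict majority rule (more than half of the sequences); the function name `majority_consensus` and the placeholder character `x` for positions without a majority are assumptions introduced here. The example inputs are toy strings, not the repeats of Table IV.

```python
from collections import Counter


def majority_consensus(repeats, placeholder="x"):
    """Build a consensus string from equal-length aligned sequences.

    A residue is reported at a position only if it occurs in more than
    half of the sequences; otherwise the placeholder is emitted.
    """
    if not repeats:
        raise ValueError("at least one sequence is required")
    if len({len(r) for r in repeats}) != 1:
        raise ValueError("all sequences must have the same length")

    consensus = []
    for column in zip(*repeats):  # one tuple of residues per position
        residue, count = Counter(column).most_common(1)[0]
        consensus.append(residue if count > len(column) / 2 else placeholder)
    return "".join(consensus)


if __name__ == "__main__":
    # Toy aligned sequences for illustration only.
    print(majority_consensus(["ACDE", "ACDF", "GCDE"]))
```

A plurality rule (most frequent residue, with ties broken arbitrarily) would be an equally simple alternative; the strict-majority choice is made here only because the table note speaks of a majority.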



FIGURE 1 

[Map showing Contig 1, Contig 2 and Contig 3 with their YACs and markers] 

WO 92/13103 PCT/US92/00376 

TB1 Amino Acid Sequence 



VAPVWGS6R AFRNPAFAAM HPRRP06FDG LGYRGGAROE Q6F6GAFPAR SFSrSSOLGN 60 

WVTTPP0IP6 SRNLHWSEKS PPY6VPTTST PYEGPTEEPF SS6666SVQ6 QSSEOLNRFA 120 

GFGIGLASLF TENVLAHPCI VLRRQCQVMY HAQHYHLTPF TVINIHYSFN KTQGPRALWK 180 

GHGSTFIVQG VTL6AE6IIS EFTPLPREVL HICWSPKQI6E HUUSLTYV VAHPFYSASL 240 

lETVQSEIIR DHTGILECVK E6IGRVI6MG VPHSKRLLPL LSLIFPTVLH GVLHYIISSV 300 

IQKFVLLXLK RICTYKSHUE STSPVQSMLO AYFPELIANF AASL CSDVIL YPLETVLHpt 360 

K|Q6TT^TIID tITPLfiYgVLP IWTQYEGMRO CIKTIRQEEG VF6FY1CGFGA VIlOnLHAA 420 

VLQITKIXYS TUQ . AJi 

TB2 Amino Acid Sequence 

ELRRFDRFLM EKNCKTOLU iCLEAICTGVNR SFIAL6YIGL VALYLVF6YG ASLLCNLI6F 60 

6YPAYISIICA ICSPNKEOOT QWLTYWYYYG VFSUEFFSO IFLSHFPFYY IUCC6FLLWC 120 

KAPSPSNGAE LLYICRIIRPF FUCHESQKOS WKDUCDKAX ETAOAITKEA KKATVKLLGE 180 

ElOCST 185 

HAAASYOQLL KQVEALKMEN SNLRQELEON SNHLTKLETE ASNHKEVLKQ LQGSIEOEAM 60 
ASSGQIOLLE RUELNLOSS NFPGVKLRSK MSLRSYGSRE 6SVSSRSGEC SPVPMGSFPR 120 
MFVKGSftES TGYLEELEKE RSLLLAOLOK EEKEKOWYYA QLQKLTICRIO SLLTENFSLQ 180 
TOMTRRQLEY EARQIRVAHE EQLGTCQOHE KRAQRRIARI QQIEmiLRI RQLLQSQATE 240 
AERSSQNKHE TGSKOAERQN EGQGV6EXNH ATS6NGQGST TRNOHETASV LSSSSTHSAP 300 
RRLTSHL6TK VEMVYSLLSH LGTHDKOOMS RTLLAHSSSQ DSCISMRQS6 CLPLLIQLLH 360 
CNOiCOSVLLfi KSRGSKEARA RASAAUIMII HSQPODKRGR REIRVLHLLE QIRAYCETCW 420 
EUQEAHEP6H OQDiCNPMPAP VEHQZCPAYC VLMCLSFOEE HRHAKNEIGG LQAIAELLQV 480 
OCEMYGLTNO HYSITLRRYA GNALTNLTFG OVANKATLCS MCGCMRALVA QUSESEDLQ 540 
QVMSVLRNL SWRAOVNSICK TLREVGSVKA UfECALEVKK E$TUCSVLSA LWNLSANCTE 600 
NKAOICAVDG AUFLVGTLT YRSQTNTUZ IES6GGILRN VSSLIATNEO HRQILRENNC 660 
LQTLLQHUCS HSLTIVSNAC GTLWNLSARN PKDQEALWDM GAVSNLKNLI HSKHICNIAN6 720 
SAAALRNLKA NRPAKYKOAN INSPGSSLPS LHVRKQKAIE AELOAQKLSE TFONZONLSP 780 
aSHRSKQRH KQSLYGOYVF OTNRNDONRS DNFNTGNKTY LSPYLNTTVL PSSSSSRGSL 840 
DSSRSEKORS LERER6IGLG NYHPATENPG TSSKRGLQIS nAAOXAKVN EEVSAIHTSQ 900 
E0RSS6STTE LHCVTOERNA LRRSSAAHTH SNTYNFTKSE NSNRTCSNPY AKLEYKRSSN 960 
0SU4SVSSS0 GYGKRGQHICP SIESYSEOOE SKFCSYGQYP AOUHKIHSA NHHOONOGa 1020 
OTPINYSUCY S0EQLNS6RQ SPSONERUAR P1CHIIE0EIK QSEQRQSRNQ STTYPVYTES 1080 
TOOKHLXFQP HFGGQECVSP YRSRGANGSE TNRVGSNHGZ NQNVSQSLCQ EDOYEODKPT 1140 
NYSERYSEEE QHEEEERPTN YSIKYNEEIS HVDQPIOYSL KYATOZPSSQ KQSFSFSICSS 1200 
SGQSSICTEHM SSSSENTSTP SSNAKRGMQL HPSSAQSRSG QPOKAATOCV SSZNQETIQT 1260 
YCVEOTPICF SRCSSLSSLS SAEOEZGCNQ TTQOPOSANT LQIAEZKEKZ 6TRSAE0PVS 1320 
EVFAVSQHPR TKSSRLQ6SS LSSESARHICA VEFSSGAICSP SKSGAQTPKS PPEHYVQETP 1380 
LMFSRCTSVS SUSFESRSI ASSVQSEPCS GNVSGZXSPS DLPOSPGQTM PPSRSKTPPP 1440 
PPQTAQTKRE VPKNKAPTAE KRESGPKQAA VNAAVQRVQV LPOAOTUHF ATESTP06FS 1500 
CSSSLSAUL OEPFZQXDVE LRZMPPVQEN ONGNETESEQ PKESNEMEi: EAEKTZOSEIC 1560 
OLLDOSOOOO ZEZLEECXXS AHPTKSSRKA OPAQTASKL PPPVARICPSQ LPVYKLLPSQ 1620 
NRLQPQKHVS FTPGOOHPRV YCVEGTPZNF STATSLSOLT ZESPPNELAA GE6VR6GA0S 1680 
GEFEICROTIP TE6RST0EAQ 6GCTSSYTZP ELDONKAEEG DZUECZNSA HPKGKSKKPF 1740 
RVKKIMOQVQ OASASSSAPN KNQLDGiaCXK PTSPVKPZPQ NTEYRTRVRK NADSKNNLNA 1800 
ERVFSDNKDS KKQNLKNNSiC OFNOKLFNNE ORVRGSFAFD SPHNHPZEG TPYCFSRNOS 1860 
LSSLOFDOOO VDLSREKAEL RXAXEMCESE AKVTSHTELT SNQQSAMCTQ AZAICQPINRG 1920 
QPKPILQKQS TFPQSSKDZP ORGAATDEXL QNFAZENTPV CFSHKSSLSS LSOXDQENNN 1980 
KCNEPZKETE PPOSQGEPSK PQASGYAPKS FHVEDTPVCF SRNSSLSSLS ZOSEOOLLQE 2040 
CISSAMPnX KPSRLXGONE KHSPRNHGGZ IGEOLTLOUC OZQRPDSEHG LSPOSENFOV 2100 
KAXQEGANSI VSSIHQAAAA ACLSRQASSO <i)SZLSUS6 ZSLGSPFHLT PDQEEJCPFTS 2160 
KIC6PRILICP6 EKSTLETXXZ ESESKGZKGG KXVYXSLXTG KVRSNSEZSG QKKQPLQANH 2220 
PSISRGRTKI HZPGVRHSSS STSPVSIOCGP PUCTPASKSP SEGQTAHSP RGAJCPSVJCSE 2280 
LSPVARQTSQ XGGSSKAPSR SGSRDSTPSR PAQOPLSRPX QSP6RNSXSP GRNGISPPNK 2340 
LSQLPRTSSP STASTKSSGS GKMSYTSP6R QMSQQNLTKQ TGL$KNAS$Z PRSESASKGL 2400 
NQHNKGNGAN OCVELSRMSS TKSS6SE5DR SERPVLVRQS TFIKEAPSPT LRRKLEESAS 2460 
FESLSPSSRP ASPTRSQAQT PVLSPSLPOH SLSTHSSVQA 6GWRKLPPNL SPTZEYNDGR 2520 
PAKRHOIARS HSESPSRLPZ NRSGTVKREH SKHSSSLPRV STURRTGSSS SZL$ASSESS 2580 
EKAKSEOEKH VNSXSGTKQS KENQVSAJCGT URKIKENEFS PTNSTSQTVS SGATNGAESK 2640 
TLIYQHAPAV SKTEDWVRZ EOCPZNNPRS 6RSPTGNTPP VZOSVSEKAK PHXKDSKDNQ 2700 
AKQNVGNGSV PHRTVGLENR LNSFIQVOAP OQKGTEIICPG QNNPVPVSCT NESSIVERTP 2760 
FSSSSSSICHS SPS6TYAARY TPFKYNPSPR KSSADSTSAR PSQIPTPVNN NTKKROSICTO 2820 
STESSGTQSP KRHSGSYLVT SV 28[?] 

FIGURE 3 

A 

APC        203  LGTCQDKEKRAQKRIARZQQIEKDILRZfiQL  233 
ral2       576  LTGAKGLQLRAUlRIAfilEQGGTAISPTSPL  606 

B 

APC        453  HKLSFOEEHRMAmELGGLQAIAELLQVO  481 
m3 mAChR   249  LYWRXY1CETEKRTKELA6LQASGTEAETC  277 